127 Commits

Author SHA1 Message Date
265c9f462f feat: musicbrainz overall search
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-09-16 13:12:50 +02:00
780daac0ef feat: Musicbrainz oriantation and class creation
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-07-03 10:48:44 +02:00
465af49057 hotfix
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-06-03 10:19:32 +02:00
2aa0f02fa5 Merge branch 'adding_genius' into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-23 13:36:10 +02:00
7b0b830d64 feat: removed legacy key
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2024-05-23 13:24:25 +02:00
1ba6c97f5a feat: more extensive browse id 2024-05-23 13:20:34 +02:00
c8cbfc7cb9 feat: improved output of clearing the cache 2024-05-23 13:17:14 +02:00
344da0a0bf fix: converting pictures to rgb before saving
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-22 15:20:26 +02:00
49dc7093c8 fix: genius fallback
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-22 15:18:43 +02:00
90f70638b4 feat: better lyrics support
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-21 17:55:08 +02:00
7b4eee858a feat: parsed script json
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-21 17:14:58 +02:00
f61b34dd40 feat: improved feature artists by also adding writer and producer to it
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-21 16:52:01 +02:00
688b4fd357 feat: getting the album tracklist
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-21 16:47:38 +02:00
769d27dc5c feat: album details
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-21 16:43:52 +02:00
f5d953d9ce feat: theoretically fetching feature songs
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-21 16:34:04 +02:00
46b64b8f8d feat: fetched the flat artist details 2024-05-21 16:23:05 +02:00
adfce16d2a feat: fetched the flat artist details 2024-05-21 16:21:58 +02:00
e4fd9faf12 feat: detecting url type 2024-05-21 15:57:09 +02:00
f6caee41a8 feat: finished searching genious
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-21 15:52:41 +02:00
068c749c38 feat: implemented artist search 2024-05-21 15:27:10 +02:00
c131924577 Merge pull request 'fix/clean_feature_artists' (#38) from fix/clean_feature_artists into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/tag/woodpecker Pipeline was successful
Reviewed-on: #38
2024-05-21 12:08:04 +00:00
8cdb5c1f99 fix: tests were a mess and didn't properly test the functionality but random things that worked with implementation
All checks were successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-21 13:52:20 +02:00
356ba658ce fix: lyrics enpoint could crash the whole program 2024-05-21 13:37:26 +02:00
000a6c0dba feat: added type identifier to option strings 2024-05-17 18:24:56 +02:00
83a3334f1a changed default value
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-17 13:55:58 +02:00
ab61ff7e9b feat: added the option to select all at options at the same time
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-17 13:37:49 +02:00
3cb35909d1 feat: added option to just fetch the objects in select 2024-05-17 13:31:46 +02:00
e87075a809 feat: changed syncing from event based to reference
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-16 17:39:12 +02:00
86e985acec fix: files don't contain id anymore
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-16 17:30:53 +02:00
a70a24d93e feat: minor adjustments
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-16 17:14:18 +02:00
2c1ac0f12d fix: wrong collection 2024-05-16 17:09:36 +02:00
897897dba2 feat: added recursive structure 2024-05-16 14:29:50 +02:00
adcf26b518 feat: renamed main album collection to album collection
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-16 14:10:00 +02:00
8ccc28daf8 draft: added feature artist collection attr 2024-05-16 14:09:34 +02:00
2b3f4d82d9 feat: renamed main_artist_collection to artist_collection
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-16 14:05:33 +02:00
41a91a6afe feat: removing dependence beteen artist.album and album.artist 2024-05-16 13:41:06 +02:00
82df96a193 Merge pull request 'feature/move_download_code_to_download' (#34) from feature/move_download_code_to_download into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Reviewed-on: #34
2024-05-15 15:24:30 +00:00
80ad2727de fix: stream retry
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-15 17:14:01 +02:00
19b83ce880 fix: saving streaming progress on retry
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-15 15:04:00 +02:00
1bf04439f0 fix: setting the genre of the song
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-15 14:51:30 +02:00
bab6aeb45d fix: removed double linebreaks from formated text, plaintext
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-15 14:26:19 +02:00
98afe5047d fix: wrong creation of source types
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-15 14:21:15 +02:00
017752c4d0 feat: better download output
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-15 14:10:32 +02:00
ea4c73158e fix: audio format is replaced completely
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-15 13:58:44 +02:00
0096dfe5cb feat: copying the downloaded music into the final locations
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-15 13:17:36 +02:00
bedd0fe819 fix: runtime errors
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-15 13:16:11 +02:00
ac6c513d56 draft: post process song
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-15 12:30:54 +02:00
cc14253239 draft: streaming the audio 2024-05-15 12:18:08 +02:00
14f986a497 draft: rewrote sources 2024-05-15 11:44:39 +02:00
da8887b279 draft: rewriting soure
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-14 15:18:17 +02:00
Hellow
bb32fc7647 draft: rewriting downloading
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-14 00:28:05 +02:00
Hellow
8c369d79e4 draft: rewriting downloading
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-13 21:51:32 +02:00
Hellow
b09d6f2691 draft: rewriting downloading 2024-05-13 21:45:12 +02:00
0e6fe8187a feat: fetch_from_url
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-13 18:09:11 +02:00
0343c11a62 feat: migrated fetch details and from source
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-13 18:03:20 +02:00
9769cf4033 Merge pull request 'fix/caching_signatures' (#32) from fix/caching_signatures into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Reviewed-on: #32
2024-05-13 15:15:37 +00:00
55024bd987 fix: key error
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-13 17:15:15 +02:00
d85498869d feat: tracksort and albumsort + some other stuff
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-13 14:22:33 +02:00
c3350b016d fix: timeout for yt music stream
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-13 13:39:57 +02:00
788103a68e fix: removed invalid stuff
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-13 13:28:54 +02:00
5179c64161 Merge branch 'experimental' of ssh://gitea.elara.ws:2222/music-kraken/music-kraken-core into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-10 17:53:39 +02:00
04405f88eb Merge branch 'fix/musify_scrapes_year_as_artist' into experimental 2024-05-10 17:52:11 +02:00
acd183c90e fix: bandcamp
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-10 17:39:30 +02:00
7186f06ce6 feat: improved interface
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-10 17:33:07 +02:00
6e354af0d1 feat: added proper settings
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-10 17:06:40 +02:00
155f239c8a feat: changed ids for audio tempfiles to random id instead of increment id
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-10 15:32:14 +02:00
36db651dfa fix: cleaning the song name deleted the song if the song name was the same as the artist name
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-10 15:25:11 +02:00
8426f6e2ea fix: filtered another year
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-10 15:20:22 +02:00
75d0a83d14 fix: changed dependency
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-09 10:57:55 +02:00
Hellow
2af577c0cd fix: removed empty objects
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-08 21:06:40 +02:00
Hellow
3780f05e58 feat: added launch.json 2024-05-08 16:48:27 +02:00
Hellow
a0305a7a6e fix: don't add year as artist 2024-05-08 16:47:56 +02:00
949583225a Merge pull request 'Correct duplicate values' (#22) from issue16 into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Reviewed-on: #22
2024-05-08 12:33:34 +00:00
4e0b005170 Merge branch 'experimental' into issue16
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-08 12:33:56 +02:00
e3d7ed8837 Merge pull request 'fix/musify_artist_spam' (#27) from fix/musify_artist_spam into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Reviewed-on: #27
2024-05-08 10:31:23 +00:00
e3e7aea959 fix for lyrics
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-08 12:27:56 +02:00
9d4e3e8545 fix: bounds get respected
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-08 12:23:16 +02:00
9c63e8e55a fix: correct collections
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-08 12:09:41 +02:00
a97f8872c8 fix: refetching release title from album card
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-08 09:57:11 +02:00
a5f8057b82 feat: improved initialization of data objects 2024-05-08 09:44:18 +02:00
e3e547c232 feat: improved musify 2024-05-08 09:15:41 +02:00
12c0bf6b83 Merge pull request 'ci: make tags release to the music-kraken pypi package instead of music-kraken-stable' (#24) from ci/remove-stable-package into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Reviewed-on: #24
2024-05-07 21:17:11 +00:00
ac9a74138c ci: make tags release to the music-kraken pypi package instead of music-kraken-stable
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-07 16:07:45 +00:00
960d3b74ac feat: prevent collection albums from being fetched from musify
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 14:59:28 +02:00
585e8c9671 Merge pull request 'feature/add_merge_command' (#23) from feature/add_merge_command into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Reviewed-on: #23
2024-05-07 12:15:07 +00:00
4f9261505e fix: skip insterval works
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-07 13:59:29 +02:00
08b9492455 fix: am source thing
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 13:55:09 +02:00
9d0dcb412b feat: added m string
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 13:34:18 +02:00
709c5ebaa8 Correct duplicate values
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2024-05-07 12:34:24 +02:00
17c26c5140 feat: added links to wiki
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 11:17:36 +02:00
0a589d9c64 feat: added links to wiki
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 11:15:20 +02:00
8abb89ea48 feat: updated readme
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 10:37:02 +02:00
3951394ede feat: improved data structure docs
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 09:10:04 +02:00
73f26e121c feat: updated installing instructions
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-07 08:53:41 +02:00
3be6c71dcd Merge pull request 'fix/reindex_before_collection' (#21) from fix/reindex_before_collection into experimental
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Reviewed-on: #21
2024-05-06 17:36:27 +00:00
Hellow
1b22c80e5c fix: removing the possibility or file names containing /
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
2024-05-06 18:48:13 +02:00
Hellow
6805d1cbe6 feat: allowed to append none to source collection
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-06 18:40:21 +02:00
Hellow
542d59562a fix: removed redundand code
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-06 18:35:25 +02:00
Hellow
131be537c8 fix: actually merging 2024-05-06 17:39:53 +02:00
ed8cc914be feat: lyrics for youtube music
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-06 16:27:49 +02:00
5ed902489f feat: added additional data
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-06 14:33:03 +02:00
90d685da81 feat: implemented correct merging of artists
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-06 12:53:06 +02:00
be7e91cb7b feat: improved the youtube music album fetching 2024-05-06 12:44:15 +02:00
7e5a1f84ae feat: improved the youtube music album fetching 2024-05-06 12:40:06 +02:00
d9105fb55a fix: some bug
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-06 10:31:21 +02:00
a7711761f9 dfa
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-03 14:55:22 +02:00
9c369b421d feat: oh no
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-05-03 14:52:12 +02:00
be843f2c10 draft: improved debug even more 2024-04-30 17:43:00 +02:00
4510520db6 feat: draft better debug
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-30 17:24:11 +02:00
e93f6d754c draft
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-30 12:32:55 +02:00
796f609d86 fix: push to 2024-04-30 09:31:38 +02:00
Hellow
312e26ec44 feat: implemented push to
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-30 08:11:10 +02:00
Hellow
a3ef671f00 feat: tried improving fetching
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-30 02:09:52 +02:00
Hellow
e9b1a12aa1 draft: the problem is in _list_renderer.py
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 23:40:48 +02:00
Hellow
3e29e1d322 draft: fix collection appending
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 22:37:07 +02:00
3737e0dc81 feat: added id possibility to output
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 18:18:57 +02:00
8e1dfd0be6 draft: added canged version
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 17:36:43 +02:00
95d1df3530 fix: not directly adding all sources
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 17:29:55 +02:00
415210522f fix: not directly adding all sources
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 17:27:12 +02:00
67f475076c feat: cleaned downloading
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 17:19:09 +02:00
8f9858da60 draft: no metadata function for source
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 17:06:31 +02:00
1971982d27 feat: added tests
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 15:31:32 +02:00
c6bdf724e3 draft: string processing
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2024-04-29 14:40:49 +02:00
aa50d2cf20 feat: renamed referrer page fixing typo 2024-04-29 13:51:43 +02:00
3eba8e90f4 feat: cleaned data objects 2024-04-29 13:49:41 +02:00
ee1aaa13b0 feat: cleaned data objects 2024-04-29 13:49:16 +02:00
1ad62df0ab feat: default implementation for options that should be sufficient 2024-04-29 13:43:34 +02:00
51 changed files with 2332 additions and 1598 deletions

22
.vscode/launch.json vendored Normal file
View File

@@ -0,0 +1,22 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: Current File",
"type": "debugpy",
"request": "launch",
"program": "${file}",
"console": "integratedTerminal"
},
{
"name": "Python Debugger: Download script",
"type": "debugpy",
"request": "launch",
"program": "development/actual_donwload.py",
"console": "integratedTerminal"
}
]
}

13
.vscode/settings.json vendored
View File

@@ -16,21 +16,32 @@
}, },
"python.formatting.provider": "none", "python.formatting.provider": "none",
"cSpell.words": [ "cSpell.words": [
"albumsort",
"APIC", "APIC",
"Bandcamp", "Bandcamp",
"bitrate",
"DEEZER",
"dotenv", "dotenv",
"encyclopaedia", "encyclopaedia",
"ENDC", "ENDC",
"Gitea",
"iframe",
"isrc",
"itemprop",
"levenshtein", "levenshtein",
"metallum", "metallum",
"MUSICBRAINZ",
"musify", "musify",
"OKBLUE", "OKBLUE",
"OKGREEN",
"pathvalidate", "pathvalidate",
"Referer", "Referer",
"sponsorblock", "sponsorblock",
"tracklist",
"tracksort", "tracksort",
"translit", "translit",
"unmap", "unmap",
"youtube" "youtube",
"youtubei"
] ]
} }

View File

@@ -11,7 +11,6 @@ steps:
build-stable: build-stable:
image: python image: python
commands: commands:
- sed -i 's/name = "music-kraken"/name = "music-kraken-stable"/' pyproject.toml
- python -m pip install -r requirements-dev.txt - python -m pip install -r requirements-dev.txt
- python3 -m build - python3 -m build
environment: environment:

226
README.md
View File

@@ -2,61 +2,43 @@
[![Woodpecker CI Status](https://ci.elara.ws/api/badges/59/status.svg)](https://ci.elara.ws/repos/59) [![Woodpecker CI Status](https://ci.elara.ws/api/badges/59/status.svg)](https://ci.elara.ws/repos/59)
<img src="assets/logo.svg" width=300 alt="music kraken logo"/> <img src="https://gitea.elara.ws/music-kraken/music-kraken-core/media/branch/experimental/assets/logo.svg" width=300 alt="music kraken logo"/>
- [Music Kraken](#music-kraken) - [Installation](#installation)
- [Installation](#installation) - [Quick-Guide](#quick-guide)
- [From source](#from-source) - [How to search properly](#query)
- [Notes for WSL](#notes-for-wsl) - [Matrix Space](#matrix-space)
- [Quick-Guide](#quick-guide)
- [Query](#query) If you want to use this a library or contribute, check out [the wiki](https://gitea.elara.ws/music-kraken/music-kraken-core/wiki) for more information.
- [CONTRIBUTE](#contribute)
- [Matrix Space](#matrix-space)
- [TODO till the next release](#todo-till-the-next-release)
- [Programming Interface / Use as Library](#programming-interface--use-as-library)
- [Quick Overview](#quick-overview)
- [Data Model](#data-model)
- [Data Objects](#data-objects)
- [Creation](#creation)
--- ---
## Installation ## Installation
You can find and get this project from either [PyPI](https://pypi.org/project/music-kraken/) as a Python-Package, You can find and get this project from either [PyPI](https://pypi.org/project/music-kraken/) as a Python-Package,
or simply the source code from [GitHub](https://github.com/HeIIow2/music-downloader). Note that even though or simply the source code from [Gitea](https://gitea.elara.ws/music-kraken/music-kraken-core). **
everything **SHOULD** work cross-platform, I have only tested it on Ubuntu.
If you enjoy this project, feel free to give it a star on GitHub.
> THE PyPI PACKAGE IS OUTDATED **NOTES**
- Even though everything **SHOULD** work cross-platform, I have only tested it on Ubuntu.
- If you enjoy this project, feel free to give it a star on GitHub.
### From source ### From source
if you use Debian or Ubuntu:
```sh ```sh
git clone https://github.com/HeIIow2/music-downloader git clone https://gitea.elara.ws/music-kraken/music-kraken-core.git
sudo apt install pandoc python3 -m pip install -e music-kraken-core/
cd music-downloader/
python3 -m pip install -r requirements.txt
``` ```
then you can add to `~/.bashrc` To update the program, if installed like this, go into the `music-kraken-core` directory and run `git pull`.
``` ### Get it running on other Systems
alias music-kraken='cd your/directory/music-downloader/src; python3 -m music_kraken'
alias 🥺='sudo'
```
```sh Here are the collected issues, that are related to running the program on different systems. If you have any issues, feel free to open a new one.
source ~/.bashrc
music-kraken
```
### Notes for WSL #### Windows + WSL
If you choose to run it in WSL, make sure ` ~/.local/bin` is added to your `$PATH` [#2][i2] Add ` ~/.local/bin` to your `$PATH`. [#2][i2]
## Quick-Guide ## Quick-Guide
@@ -87,10 +69,6 @@ The escape character is as usual `\`.
--- ---
## CONTRIBUTE
I am happy about every pull request. To contribute look [here](contribute.md).
## Matrix Space ## Matrix Space
<img align="right" alt="music-kraken logo" src="assets/element_logo.png" width=100> <img align="right" alt="music-kraken logo" src="assets/element_logo.png" width=100>
@@ -99,171 +77,5 @@ I decided against creating a discord server, due to various communities get ofte
**Click [this invitation](https://matrix.to/#/#music-kraken:matrix.org) _([https://matrix.to/#/#music-kraken:matrix.org](https://matrix.to/#/#music-kraken:matrix.org))_ to join.** **Click [this invitation](https://matrix.to/#/#music-kraken:matrix.org) _([https://matrix.to/#/#music-kraken:matrix.org](https://matrix.to/#/#music-kraken:matrix.org))_ to join.**
## TODO till the next release
> These Points will most likely be in the changelogs.
- [x] Migrate away from pandoc, to a more lightweight alternative, that can be installed over PiPY.
- [ ] Update the Documentation of the internal structure. _(could be pushed back one release)_
---
# Programming Interface / Use as Library
This application is $100\%$ centered around Data. Thus, the most important thing for working with musik kraken is, to understand how I structured the data.
## Quick Overview
- explanation of the [Data Model](#data-model)
- how to use the [Data Objects](#data-objects)
- further Dokumentation of _hopefully_ [most relevant classes](documentation/objects.md)
- the [old implementation](documentation/old_implementation.md)
```mermaid
---
title: Quick Overview (outdated)
---
sequenceDiagram
participant pg as Page (eg. YouTube, MB, Musify, ...)
participant obj as DataObjects (eg. Song, Artist, ...)
participant db as DataBase
obj ->> db: write
db ->> obj: read
pg -> obj: find a source for any page, for object.
obj -> pg: add more detailed data from according page.
obj -> pg: if available download audio to target.
```
## Data Model
The Data Structure, that the whole programm is built on looks as follows:
```mermaid
---
title: Music Data
---
erDiagram
Target {
}
Lyrics {
}
Song {
}
Album {
}
Artist {
}
Label {
}
Source {
}
Source }o--|| Song : ""
Source }o--|| Lyrics : ""
Source }o--|| Album : ""
Source }o--|| Artist : ""
Source }o--|| Label : ""
Song }o--o{ Album : AlbumSong
Album }o--o{ Artist : ArtistAlbum
Song }o--o{ Artist : "ArtistSong (features)"
Label }o--o{ Album : LabelAlbum
Label }o--o{ Artist : LabelSong
Song ||--o{ Lyrics : ""
Song ||--o{ Target : ""
```
Ok now this **WILL** look intimidating, thus I break it down quickly.
*That is also the reason I didn't add all Attributes here.*
The most important Entities are:
- Song
- Album
- Artist
- Label
All of them *(and Lyrics)* can have multiple Sources, and every Source can only Point to one of those Element.
The `Target` Entity represents the location on the hard drive a Song has. One Song can have multiple download Locations.
The `Lyrics` Entity simply represents the Lyrics of each Song. One Song can have multiple Lyrics, e.g. Translations.
Here is the simplified Diagramm without only the main Entities.
```mermaid
---
title: simplified Music Data
---
erDiagram
Song {
}
Album {
}
Artist {
}
Label {
}
Song }o--o{ Album : AlbumSong
Album }o--o{ Artist : ArtistAlbum
Song }o--o{ Artist : "ArtistSong (features)"
Label }o--o{ Album : LabelAlbum
Label }o--o{ Artist : LabelSong
```
Looks way more manageable, doesn't it?
The reason every relation here is a `n:m` *(many to many)* relation is not, that it makes sense in the aspekt of modeling reality, but to be able to put data from many Sources in the same Data Model.
Every Service models Data a bit different, and projecting a one-to-many relationship to a many to many relationship without data loss is easy. The other way around it is basically impossible
## Data Objects
> Not 100% accurate yet and *might* change slightly
### Creation
```python
# needs to be added
```
If you just want to start implementing, then just use the code example I provided, I don't care.
For those who don't want any bugs and use it as intended *(which is recommended, cuz I am only one person so there are defs bugs)* continue reading, and read the whole documentation, which may exist in the future xD
[i10]: https://github.com/HeIIow2/music-downloader/issues/10 [i10]: https://github.com/HeIIow2/music-downloader/issues/10
[i2]: https://github.com/HeIIow2/music-downloader/issues/2 [i2]: https://github.com/HeIIow2/music-downloader/issues/2

View File

@@ -7,7 +7,9 @@ logging.getLogger().setLevel(logging.DEBUG)
if __name__ == "__main__": if __name__ == "__main__":
commands = [ commands = [
"s: #a Crystal F", "s: #a Crystal F",
"d: 20", "10",
"1",
"3",
] ]

View File

@@ -2,30 +2,24 @@ import music_kraken
from music_kraken.objects import Song, Album, Artist, Collection from music_kraken.objects import Song, Album, Artist, Collection
if __name__ == "__main__": if __name__ == "__main__":
album_1 = Album( song_1 = Song(
title="album", title="song",
song_list=[ feature_artist_list=[Artist(
Song(title="song", main_artist_list=[Artist(name="artist")]), name="main_artist"
], )]
artist_list=[
Artist(name="artist 3"),
]
) )
album_2 = Album( other_artist = Artist(name="other_artist")
title="album",
song_list=[ song_2 = Song(
Song(title="song", main_artist_list=[Artist(name="artist 2")]), title = "song",
], artist_list=[other_artist]
artist_list=[
Artist(name="artist"),
]
) )
album_1.merge(album_2) other_artist.name = "main_artist"
print() song_1.merge(song_2)
print(*(f"{a.title_string} ; {a.id}" for a in album_1.artist_collection.data), sep=" | ")
print(id(album_1.artist_collection), id(album_2.artist_collection)) print("#" * 120)
print(id(album_1.song_collection[0].main_artist_collection), id(album_2.song_collection[0].main_artist_collection)) print("main", *song_1.artist_collection)
print("feat", *song_1.feature_artist_collection)

View File

@@ -10,12 +10,12 @@ from ..objects import Target
LOGGER = logging_settings["codex_logger"] LOGGER = logging_settings["codex_logger"]
def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], audio_format: str = main_settings["audio_format"], interval_list: List[Tuple[float, float]] = None): def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], audio_format: str = main_settings["audio_format"], skip_intervals: List[Tuple[float, float]] = None):
if not target.exists: if not target.exists:
LOGGER.warning(f"Target doesn't exist: {target.file_path}") LOGGER.warning(f"Target doesn't exist: {target.file_path}")
return return
interval_list = interval_list or [] skip_intervals = skip_intervals or []
bitrate_b = int(bitrate_kb / 1024) bitrate_b = int(bitrate_kb / 1024)
@@ -29,7 +29,7 @@ def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], au
start = 0 start = 0
next_start = 0 next_start = 0
for end, next_start in interval_list: for end, next_start in skip_intervals:
aselect_list.append(f"between(t,{start},{end})") aselect_list.append(f"between(t,{start},{end})")
start = next_start start = next_start
aselect_list.append(f"gte(t,{next_start})") aselect_list.append(f"gte(t,{next_start})")
@@ -47,7 +47,7 @@ def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], au
# run the ffmpeg command with a progressbar # run the ffmpeg command with a progressbar
ff = FfmpegProgress(ffmpeg_command) ff = FfmpegProgress(ffmpeg_command)
with tqdm(total=100, desc=f"removing {len(interval_list)} segments") as pbar: with tqdm(total=100, desc=f"processing") as pbar:
for progress in ff.run_command_with_progress(): for progress in ff.run_command_with_progress():
pbar.update(progress-pbar.n) pbar.update(progress-pbar.n)

View File

@@ -1,5 +1,5 @@
import mutagen import mutagen
from mutagen.id3 import ID3, Frame, APIC from mutagen.id3 import ID3, Frame, APIC, USLT
from pathlib import Path from pathlib import Path
from typing import List from typing import List
import logging import logging
@@ -7,6 +7,7 @@ from PIL import Image
from ..utils.config import logging_settings, main_settings from ..utils.config import logging_settings, main_settings
from ..objects import Song, Target, Metadata from ..objects import Song, Target, Metadata
from ..objects.metadata import Mapping
from ..connection import Connection from ..connection import Connection
LOGGER = logging_settings["tagging_logger"] LOGGER = logging_settings["tagging_logger"]
@@ -79,7 +80,7 @@ def write_metadata_to_target(metadata: Metadata, target: Target, song: Song):
with temp_target.open("wb") as f: with temp_target.open("wb") as f:
f.write(r.content) f.write(r.content)
converted_target: Target = Target.temp(name=f"{song.title}.jpeg") converted_target: Target = Target.temp(name=f"{song.title.replace('/', '_')}")
with Image.open(temp_target.file_path) as img: with Image.open(temp_target.file_path) as img:
# crop the image if it isn't square in the middle with minimum data loss # crop the image if it isn't square in the middle with minimum data loss
width, height = img.size width, height = img.size
@@ -92,6 +93,10 @@ def write_metadata_to_target(metadata: Metadata, target: Target, song: Song):
# resize the image to the preferred resolution # resize the image to the preferred resolution
img.thumbnail((main_settings["preferred_artwork_resolution"], main_settings["preferred_artwork_resolution"])) img.thumbnail((main_settings["preferred_artwork_resolution"], main_settings["preferred_artwork_resolution"]))
# https://stackoverflow.com/a/59476938/16804841
if img.mode != 'RGB':
img = img.convert('RGB')
img.save(converted_target.file_path, "JPEG") img.save(converted_target.file_path, "JPEG")
# https://stackoverflow.com/questions/70228440/mutagen-how-can-i-correctly-embed-album-art-into-mp3-file-so-that-i-can-see-t # https://stackoverflow.com/questions/70228440/mutagen-how-can-i-correctly-embed-album-art-into-mp3-file-so-that-i-can-see-t
@@ -105,8 +110,11 @@ def write_metadata_to_target(metadata: Metadata, target: Target, song: Song):
data=converted_target.read_bytes(), data=converted_target.read_bytes(),
) )
) )
id3_object.frames.delall("USLT")
mutagen_file = mutagen.File(target.file_path) uslt_val = metadata.get_id3_value(Mapping.UNSYNCED_LYRICS)
id3_object.frames.add(
USLT(encoding=3, lang=u'eng', desc=u'desc', text=uslt_val)
)
id3_object.add_metadata(metadata) id3_object.add_metadata(metadata)
id3_object.save() id3_object.save()

View File

@@ -6,16 +6,18 @@ import re
from .utils import cli_function from .utils import cli_function
from .options.first_config import initial_config from .options.first_config import initial_config
from ..utils import output, BColors
from ..utils.config import write_config, main_settings from ..utils.config import write_config, main_settings
from ..utils.shared import URL_PATTERN from ..utils.shared import URL_PATTERN
from ..utils.string_processing import fit_to_file_system from ..utils.string_processing import fit_to_file_system
from ..utils.support_classes.query import Query from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult from ..utils.support_classes.download_result import DownloadResult
from ..utils.exception import MKInvalidInputException
from ..utils.exception.download import UrlNotFoundException from ..utils.exception.download import UrlNotFoundException
from ..utils.enums.colors import BColors from ..utils.enums.colors import BColors
from .. import console from .. import console
from ..download.results import Results, Option, PageResults from ..download.results import Results, Option, PageResults, GoToResults
from ..download.page_attributes import Pages from ..download.page_attributes import Pages
from ..pages import Page from ..pages import Page
from ..objects import Song, Album, Artist, DatabaseObject from ..objects import Song, Album, Artist, DatabaseObject
@@ -164,9 +166,9 @@ class Downloader:
self.genre = genre or get_genre() self.genre = genre or get_genre()
self.process_metadata_anyway = process_metadata_anyway self.process_metadata_anyway = process_metadata_anyway
print() output()
print(f"Downloading to: \"{self.genre}\"") output(f"Downloading to: \"{self.genre}\"", color=BColors.HEADER)
print() output()
def print_current_options(self): def print_current_options(self):
self.page_dict = dict() self.page_dict = dict()
@@ -174,10 +176,8 @@ class Downloader:
print() print()
page_count = 0 page_count = 0
for option in self.current_results.formated_generator(max_items_per_page=self.max_displayed_options): for option in self.current_results.formatted_generator():
if isinstance(option, Option): if isinstance(option, Option):
_downloadable = self.pages.is_downloadable(option.music_object)
r = f"{BColors.GREY.value}{option.index:0{self.option_digits}}{BColors.ENDC.value} {option.music_object.option_string}" r = f"{BColors.GREY.value}{option.index:0{self.option_digits}}{BColors.ENDC.value} {option.music_object.option_string}"
print(r) print(r)
else: else:
@@ -226,7 +226,7 @@ class Downloader:
if album is not None: if album is not None:
song.album_collection.append(album) song.album_collection.append(album)
if artist is not None: if artist is not None:
song.main_artist_collection.append(artist) song.artist_collection.append(artist)
return Query(raw_query=query, music_object=song) return Query(raw_query=query, music_object=song)
if album is not None: if album is not None:
@@ -249,7 +249,7 @@ class Downloader:
f"Recommendations and suggestions on sites to implement appreciated.\n" f"Recommendations and suggestions on sites to implement appreciated.\n"
f"But don't be a bitch if I don't end up implementing it.") f"But don't be a bitch if I don't end up implementing it.")
return return
self.set_current_options(PageResults(page, data_object.options)) self.set_current_options(PageResults(page, data_object.options, max_items_per_page=self.max_displayed_options))
self.print_current_options() self.print_current_options()
return return
@@ -299,95 +299,128 @@ class Downloader:
self.set_current_options(self.pages.search(parsed_query)) self.set_current_options(self.pages.search(parsed_query))
self.print_current_options() self.print_current_options()
def goto(self, index: int): def goto(self, data_object: DatabaseObject):
page: Type[Page] page: Type[Page]
music_object: DatabaseObject
try: self.pages.fetch_details(data_object, stop_at_level=1)
page, music_object = self.current_results.get_music_object_by_index(index)
except KeyError:
print()
print(f"The option {index} doesn't exist.")
print()
return
self.pages.fetch_details(music_object) self.set_current_options(GoToResults(data_object.options, max_items_per_page=self.max_displayed_options))
print(music_object)
print(music_object.options)
self.set_current_options(PageResults(page, music_object.options))
self.print_current_options() self.print_current_options()
def download(self, download_str: str, download_all: bool = False) -> bool: def download(self, data_objects: List[DatabaseObject], **kwargs) -> bool:
to_download: List[DatabaseObject] = [] output()
if len(data_objects) > 1:
if re.match(URL_PATTERN, download_str) is not None: output(f"Downloading {len(data_objects)} objects...", *("- " + o.option_string for o in data_objects), color=BColors.BOLD, sep="\n")
_, music_objects = self.pages.fetch_url(download_str)
to_download.append(music_objects)
else:
index: str
for index in download_str.split(", "):
if not index.strip().isdigit():
print()
print(f"Every download thingie has to be an index, not {index}.")
print()
return False
for index in download_str.split(", "):
to_download.append(self.current_results.get_music_object_by_index(int(index))[1])
print()
print("Downloading:")
for download_object in to_download:
print(download_object.option_string)
print()
_result_map: Dict[DatabaseObject, DownloadResult] = dict() _result_map: Dict[DatabaseObject, DownloadResult] = dict()
for database_object in to_download: for database_object in data_objects:
r = self.pages.download(music_object=database_object, genre=self.genre, download_all=download_all, r = self.pages.download(
process_metadata_anyway=self.process_metadata_anyway) data_object=database_object,
genre=self.genre,
**kwargs
)
_result_map[database_object] = r _result_map[database_object] = r
for music_object, result in _result_map.items(): for music_object, result in _result_map.items():
print() output()
print(music_object.option_string) output(music_object.option_string)
print(result) output(result)
return True return True
def process_input(self, input_str: str) -> bool: def process_input(self, input_str: str) -> bool:
input_str = input_str.strip() try:
processed_input: str = input_str.lower() input_str = input_str.strip()
processed_input: str = input_str.lower()
if processed_input in EXIT_COMMANDS: if processed_input in EXIT_COMMANDS:
return True return True
if processed_input == ".": if processed_input == ".":
self.print_current_options()
return False
if processed_input == "..":
if self.previous_option():
self.print_current_options() self.print_current_options()
return False
if processed_input == "..":
if self.previous_option():
self.print_current_options()
return False
command = ""
query = processed_input
if ":" in processed_input:
_ = processed_input.split(":")
command, query = _[0], ":".join(_[1:])
do_search = "s" in command
do_fetch = "f" in command
do_download = "d" in command
do_merge = "m" in command
if do_search and (do_download or do_fetch or do_merge):
raise MKInvalidInputException(message="You can't search and do another operation at the same time.")
if do_search:
self.search(":".join(input_str.split(":")[1:]))
return False
def get_selected_objects(q: str):
if q.strip().lower() == "all":
return list(self.current_results)
indices = []
for possible_index in q.split(","):
possible_index = possible_index.strip()
if possible_index == "":
continue
i = 0
try:
i = int(possible_index)
except ValueError:
raise MKInvalidInputException(message=f"The index \"{possible_index}\" is not a number.")
if i < 0 or i >= len(self.current_results):
raise MKInvalidInputException(message=f"The index \"{i}\" is not within the bounds of 0-{len(self.current_results) - 1}.")
indices.append(i)
return [self.current_results[i] for i in indices]
selected_objects = get_selected_objects(query)
if do_merge:
old_selected_objects = selected_objects
a = old_selected_objects[0]
for b in old_selected_objects[1:]:
if type(a) != type(b):
raise MKInvalidInputException(message="You can't merge different types of objects.")
a.merge(b)
selected_objects = [a]
if do_fetch:
for data_object in selected_objects:
self.pages.fetch_details(data_object)
self.print_current_options()
return False
if do_download:
self.download(selected_objects)
return False
if len(selected_objects) != 1:
raise MKInvalidInputException(message="You can only go to one object at a time without merging.")
self.goto(selected_objects[0])
return False return False
except MKInvalidInputException as e:
output("\n" + e.message + "\n", color=BColors.FAIL)
help_message()
if processed_input.startswith("s: "):
self.search(input_str[3:])
return False
if processed_input.startswith("d: "):
return self.download(input_str[3:])
if processed_input.isdigit():
self.goto(int(processed_input))
return False
if processed_input != "help":
print(f"{BColors.WARNING.value}Invalid input.{BColors.ENDC.value}")
help_message()
return False return False
def mainloop(self): def mainloop(self):

View File

@@ -6,6 +6,7 @@ from typing import List, Optional
from functools import lru_cache from functools import lru_cache
import logging import logging
from ..utils import output, BColors
from ..utils.config import main_settings from ..utils.config import main_settings
from ..utils.string_processing import fit_to_file_system from ..utils.string_processing import fit_to_file_system
@@ -136,13 +137,13 @@ class Cache:
) )
self._write_attribute(cache_attribute) self._write_attribute(cache_attribute)
cache_path = fit_to_file_system(Path(module_path, name), hidden_ok=True) cache_path = fit_to_file_system(Path(module_path, name.replace("/", "_")), hidden_ok=True)
with cache_path.open("wb") as content_file: with cache_path.open("wb") as content_file:
self.logger.debug(f"writing cache to {cache_path}") self.logger.debug(f"writing cache to {cache_path}")
content_file.write(content) content_file.write(content)
def get(self, name: str) -> Optional[CacheResult]: def get(self, name: str) -> Optional[CacheResult]:
path = fit_to_file_system(Path(self._dir, self.module, name), hidden_ok=True) path = fit_to_file_system(Path(self._dir, self.module, name.replace("/", "_")), hidden_ok=True)
if not path.is_file(): if not path.is_file():
return None return None
@@ -165,7 +166,7 @@ class Cache:
if ca.name == "": if ca.name == "":
continue continue
file = fit_to_file_system(Path(self._dir, ca.module, ca.name), hidden_ok=True) file = fit_to_file_system(Path(self._dir, ca.module, ca.name.replace("/", "_")), hidden_ok=True)
if not ca.is_valid: if not ca.is_valid:
self.logger.debug(f"deleting cache {ca.id}") self.logger.debug(f"deleting cache {ca.id}")
@@ -204,9 +205,12 @@ class Cache:
for path in self._dir.iterdir(): for path in self._dir.iterdir():
if path.is_dir(): if path.is_dir():
for file in path.iterdir(): for file in path.iterdir():
output(f"Deleting file {file}", color=BColors.GREY)
file.unlink() file.unlink()
output(f"Deleting folder {path}", color=BColors.HEADER)
path.rmdir() path.rmdir()
else: else:
output(f"Deleting folder {path}", color=BColors.HEADER)
path.unlink() path.unlink()
self.cached_attributes.clear() self.cached_attributes.clear()

View File

@@ -317,7 +317,7 @@ class Connection:
name = kwargs.pop("description") name = kwargs.pop("description")
if progress > 0: if progress > 0:
headers = dict() if headers is None else headers headers = kwargs.get("headers", dict())
headers["Range"] = f"bytes={target.size}-" headers["Range"] = f"bytes={target.size}-"
r = self.request( r = self.request(
@@ -366,6 +366,7 @@ class Connection:
if retry: if retry:
self.LOGGER.warning(f"Retrying stream...") self.LOGGER.warning(f"Retrying stream...")
accepted_response_codes.add(206) accepted_response_codes.add(206)
stream_kwargs["progress"] = progress
return Connection.stream_into(**stream_kwargs) return Connection.stream_into(**stream_kwargs)
return DownloadResult() return DownloadResult()

View File

@@ -0,0 +1,21 @@
from dataclasses import dataclass, field
from typing import Set
from ..utils.config import main_settings
from ..utils.enums.album import AlbumType
@dataclass
class FetchOptions:
download_all: bool = False
album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
@dataclass
class DownloadOptions:
download_all: bool = False
album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
download_again_if_found: bool = False
process_audio_if_found: bool = False
process_metadata_if_found: bool = True

View File

@@ -1,23 +1,45 @@
from typing import Tuple, Type, Dict, Set from typing import Tuple, Type, Dict, Set, Optional, List
from collections import defaultdict
from pathlib import Path
import re
import logging
from . import FetchOptions, DownloadOptions
from .results import SearchResults from .results import SearchResults
from ..objects import DatabaseObject, Source from ..objects import (
DatabaseObject as DataObject,
from ..utils.config import youtube_settings Collection,
from ..utils.enums.source import SourcePages Target,
Source,
Options,
Song,
Album,
Artist,
Label,
)
from ..audio import write_metadata_to_target, correct_codec
from ..utils import output, BColors
from ..utils.string_processing import fit_to_file_system
from ..utils.config import youtube_settings, main_settings
from ..utils.path_manager import LOCATIONS
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.support_classes.download_result import DownloadResult from ..utils.support_classes.download_result import DownloadResult
from ..utils.support_classes.query import Query from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
from ..utils.exception import MKMissingNameException
from ..utils.exception.download import UrlNotFoundException from ..utils.exception.download import UrlNotFoundException
from ..utils.shared import DEBUG_PAGES from ..utils.shared import DEBUG_PAGES
from ..pages import Page, EncyclopaediaMetallum, Musify, YouTube, YoutubeMusic, Bandcamp, INDEPENDENT_DB_OBJECTS from ..pages import Page, EncyclopaediaMetallum, Musify, YouTube, YoutubeMusic, Bandcamp, Musicbrainz, Genius, INDEPENDENT_DB_OBJECTS
ALL_PAGES: Set[Type[Page]] = { ALL_PAGES: Set[Type[Page]] = {
# EncyclopaediaMetallum, # EncyclopaediaMetallum,
Genius,
Musify, Musify,
YoutubeMusic, YoutubeMusic,
Bandcamp Bandcamp,
Musicbrainz
} }
if youtube_settings["use_youtube_alongside_youtube_music"]: if youtube_settings["use_youtube_alongside_youtube_music"]:
@@ -34,6 +56,13 @@ SHADY_PAGES: Set[Type[Page]] = {
Musify, Musify,
} }
fetch_map = {
Song: "fetch_song",
Album: "fetch_album",
Artist: "fetch_artist",
Label: "fetch_label",
}
if DEBUG_PAGES: if DEBUG_PAGES:
DEBUGGING_PAGE = Bandcamp DEBUGGING_PAGE = Bandcamp
print(f"Only downloading from page {DEBUGGING_PAGE}.") print(f"Only downloading from page {DEBUGGING_PAGE}.")
@@ -43,10 +72,15 @@ if DEBUG_PAGES:
class Pages: class Pages:
def __init__(self, exclude_pages: Set[Type[Page]] = None, exclude_shady: bool = False) -> None: def __init__(self, exclude_pages: Set[Type[Page]] = None, exclude_shady: bool = False, download_options: DownloadOptions = None, fetch_options: FetchOptions = None):
self.LOGGER = logging.getLogger("download")
self.download_options: DownloadOptions = download_options or DownloadOptions()
self.fetch_options: FetchOptions = fetch_options or FetchOptions()
# initialize all page instances # initialize all page instances
self._page_instances: Dict[Type[Page], Page] = dict() self._page_instances: Dict[Type[Page], Page] = dict()
self._source_to_page: Dict[SourcePages, Type[Page]] = dict() self._source_to_page: Dict[SourceType, Type[Page]] = dict()
exclude_pages = exclude_pages if exclude_pages is not None else set() exclude_pages = exclude_pages if exclude_pages is not None else set()
@@ -66,9 +100,14 @@ class Pages:
self.audio_pages: Tuple[Type[Page], ...] = _set_to_tuple(self._audio_pages_set) self.audio_pages: Tuple[Type[Page], ...] = _set_to_tuple(self._audio_pages_set)
for page_type in self.pages: for page_type in self.pages:
self._page_instances[page_type] = page_type() self._page_instances[page_type] = page_type(fetch_options=self.fetch_options, download_options=self.download_options)
self._source_to_page[page_type.SOURCE_TYPE] = page_type self._source_to_page[page_type.SOURCE_TYPE] = page_type
def _get_page_from_enum(self, source_page: SourceType) -> Page:
if source_page not in self._source_to_page:
return None
return self._page_instances[self._source_to_page[source_page]]
def search(self, query: Query) -> SearchResults: def search(self, query: Query) -> SearchResults:
result = SearchResults() result = SearchResults()
@@ -80,54 +119,211 @@ class Pages:
return result return result
def fetch_details(self, music_object: DatabaseObject, stop_at_level: int = 1) -> DatabaseObject: def fetch_details(self, data_object: DataObject, stop_at_level: int = 1, **kwargs) -> DataObject:
if not isinstance(music_object, INDEPENDENT_DB_OBJECTS): if not isinstance(data_object, INDEPENDENT_DB_OBJECTS):
return music_object return data_object
for source_page in music_object.source_collection.source_pages: source: Source
if source_page not in self._source_to_page: for source in data_object.source_collection.get_sources(source_type_sorting={
continue "only_with_page": True,
}):
new_data_object = self.fetch_from_source(source=source, stop_at_level=stop_at_level)
if new_data_object is not None:
data_object.merge(new_data_object)
page_type = self._source_to_page[source_page] return data_object
if page_type in self._pages_set: def fetch_from_source(self, source: Source, **kwargs) -> Optional[DataObject]:
music_object.merge(self._page_instances[page_type].fetch_details(music_object=music_object, stop_at_level=stop_at_level)) if not source.has_page:
return None
return music_object source_type = source.page.get_source_type(source=source)
if source_type is None:
self.LOGGER.debug(f"Could not determine source type for {source}.")
return None
def is_downloadable(self, music_object: DatabaseObject) -> bool: func = getattr(source.page, fetch_map[source_type])
_page_types = set(self._source_to_page)
for src in music_object.source_collection.source_pages:
if src in self._source_to_page:
_page_types.add(self._source_to_page[src])
audio_pages = self._audio_pages_set.intersection(_page_types) # fetching the data object and marking it as fetched
return len(audio_pages) > 0 data_object: DataObject = func(source=source, **kwargs)
data_object.mark_as_fetched(source.hash_url)
return data_object
def download(self, music_object: DatabaseObject, genre: str, download_all: bool = False, process_metadata_anyway: bool = False) -> DownloadResult: def fetch_from_url(self, url: str) -> Optional[DataObject]:
if not isinstance(music_object, INDEPENDENT_DB_OBJECTS): source = Source.match_url(url, ALL_SOURCE_TYPES.MANUAL)
return DownloadResult(error_message=f"{type(music_object).__name__} can't be downloaded.") if source is None:
return None
self.fetch_details(music_object) return self.fetch_from_source(source=source)
_page_types = set(self._source_to_page) def _skip_object(self, data_object: DataObject) -> bool:
for src in music_object.source_collection.source_pages: if isinstance(data_object, Album):
if src in self._source_to_page: if not self.download_options.download_all and data_object.album_type in self.download_options.album_type_blacklist:
_page_types.add(self._source_to_page[src]) return True
audio_pages = self._audio_pages_set.intersection(_page_types) return False
for download_page in audio_pages: def download(self, data_object: DataObject, genre: str, **kwargs) -> DownloadResult:
return self._page_instances[download_page].download(music_object=music_object, genre=genre, download_all=download_all, process_metadata_anyway=process_metadata_anyway) # fetch the given object
self.fetch_details(data_object)
output(f"\nDownloading {data_object.option_string}...", color=BColors.BOLD)
return DownloadResult(error_message=f"No audio source has been found for {music_object}.") # fetching all parent objects (e.g. if you only download a song)
if not kwargs.get("fetched_upwards", False):
to_fetch: List[DataObject] = [data_object]
def fetch_url(self, url: str, stop_at_level: int = 2) -> Tuple[Type[Page], DatabaseObject]: while len(to_fetch) > 0:
source = Source.match_url(url, SourcePages.MANUAL) new_to_fetch = []
for d in to_fetch:
if self._skip_object(d):
continue
self.fetch_details(d)
for c in d.get_parent_collections():
new_to_fetch.extend(c)
to_fetch = new_to_fetch
kwargs["fetched_upwards"] = True
# download all children
download_result: DownloadResult = DownloadResult()
for c in data_object.get_child_collections():
for d in c:
if self._skip_object(d):
continue
download_result.merge(self.download(d, genre, **kwargs))
# actually download if the object is a song
if isinstance(data_object, Song):
"""
TODO
add the traced artist and album to the naming.
I am able to do that, because duplicate values are removed later on.
"""
self._download_song(data_object, naming={
"genre": [genre],
"audio_format": [main_settings["audio_format"]],
})
return download_result
def _extract_fields_from_template(self, path_template: str) -> Set[str]:
return set(re.findall(r"{([^}]+)}", path_template))
def _parse_path_template(self, path_template: str, naming: Dict[str, List[str]]) -> str:
field_names: Set[str] = self._extract_fields_from_template(path_template)
for field in field_names:
if len(naming[field]) == 0:
raise MKMissingNameException(f"Missing field for {field}.")
path_template = path_template.replace(f"{{{field}}}", naming[field][0])
return path_template
def _download_song(self, song: Song, naming: dict) -> DownloadOptions:
"""
TODO
Search the song in the file system.
"""
r = DownloadResult(total=1)
# pre process the data recursively
song.compile()
# manage the naming
naming: Dict[str, List[str]] = defaultdict(list, naming)
naming["song"].append(song.title_value)
naming["isrc"].append(song.isrc)
naming["album"].extend(a.title_value for a in song.album_collection)
naming["album_type"].extend(a.album_type.value for a in song.album_collection)
naming["artist"].extend(a.name for a in song.artist_collection)
naming["artist"].extend(a.name for a in song.feature_artist_collection)
for a in song.album_collection:
naming["label"].extend([l.title_value for l in a.label_collection])
# removing duplicates from the naming, and process the strings
for key, value in naming.items():
# https://stackoverflow.com/a/17016257
naming[key] = list(dict.fromkeys(value))
song.genre = naming["genre"][0]
# manage the targets
tmp: Target = Target.temp(file_extension=main_settings["audio_format"])
song.target_collection.append(Target(
relative_to_music_dir=True,
file_path=Path(
self._parse_path_template(main_settings["download_path"], naming=naming),
self._parse_path_template(main_settings["download_file"], naming=naming),
)
))
for target in song.target_collection:
if target.exists:
output(f'{target.file_path} {BColors.OKGREEN.value}[already exists]', color=BColors.GREY)
r.found_on_disk += 1
if not self.download_options.download_again_if_found:
target.copy_content(tmp)
else:
target.create_path()
output(f'{target.file_path}', color=BColors.GREY)
# this streams from every available source until something succeeds, setting the skip intervals to the values of the according source
used_source: Optional[Source] = None
skip_intervals: List[Tuple[float, float]] = []
for source in song.source_collection.get_sources(source_type_sorting={
"only_with_page": True,
"sort_key": lambda page: page.download_priority,
"reverse": True,
}):
if tmp.exists:
break
used_source = source
streaming_results = source.page.download_song_to_target(source=source, target=tmp, desc="download")
skip_intervals = source.page.get_skip_intervals(song=song, source=source)
# if something has been downloaded but it somehow failed, delete the file
if streaming_results.is_fatal_error and tmp.exists:
tmp.delete()
# if everything went right, the file should exist now
if not tmp.exists:
if used_source is None:
r.error_message = f"No source found for {song.option_string}."
else:
r.error_message = f"Something went wrong downloading {song.option_string}."
return r
# post process the audio
found_on_disk = used_source is None
if not found_on_disk or self.download_options.process_audio_if_found:
correct_codec(target=tmp, skip_intervals=skip_intervals)
r.sponsor_segments = len(skip_intervals)
if used_source is not None:
used_source.page.post_process_hook(song=song, temp_target=tmp)
if not found_on_disk or self.download_options.process_metadata_if_found:
write_metadata_to_target(metadata=song.metadata, target=tmp, song=song)
# copy the tmp target to the final locations
for target in song.target_collection:
tmp.copy_content(target)
tmp.delete()
return r
def fetch_url(self, url: str, stop_at_level: int = 2) -> Tuple[Type[Page], DataObject]:
source = Source.match_url(url, ALL_SOURCE_TYPES.MANUAL)
if source is None: if source is None:
raise UrlNotFoundException(url=url) raise UrlNotFoundException(url=url)
_actual_page = self._source_to_page[source.page_enum] _actual_page = self._source_to_page[source.source_type]
return _actual_page, self._page_instances[_actual_page].fetch_object_from_source(source=source, stop_at_level=stop_at_level) return _actual_page, self._page_instances[_actual_page].fetch_object_from_source(source=source, stop_at_level=stop_at_level)

View File

@@ -2,7 +2,6 @@ from typing import Tuple, Type, Dict, List, Generator, Union
from dataclasses import dataclass from dataclasses import dataclass
from ..objects import DatabaseObject from ..objects import DatabaseObject
from ..utils.enums.source import SourcePages
from ..pages import Page, EncyclopaediaMetallum, Musify from ..pages import Page, EncyclopaediaMetallum, Musify
@@ -13,31 +12,35 @@ class Option:
class Results: class Results:
def __init__(self) -> None: def __init__(self, max_items_per_page: int = 10, **kwargs) -> None:
self._by_index: Dict[int, DatabaseObject] = dict() self._by_index: Dict[int, DatabaseObject] = dict()
self._page_by_index: Dict[int: Type[Page]] = dict() self._page_by_index: Dict[int: Type[Page]] = dict()
self.max_items_per_page = max_items_per_page
def __iter__(self) -> Generator[DatabaseObject, None, None]: def __iter__(self) -> Generator[DatabaseObject, None, None]:
for option in self.formated_generator(): for option in self.formatted_generator():
if isinstance(option, Option): if isinstance(option, Option):
yield option.music_object yield option.music_object
def formated_generator(self, max_items_per_page: int = 10) -> Generator[Union[Type[Page], Option], None, None]: def formatted_generator(self) -> Generator[Union[Type[Page], Option], None, None]:
self._by_index = dict() self._by_index = dict()
self._page_by_index = dict() self._page_by_index = dict()
def get_music_object_by_index(self, index: int) -> Tuple[Type[Page], DatabaseObject]: def __len__(self) -> int:
# if this throws a key error, either the formatted generator needs to be iterated, or the option doesn't exist. return max(self._by_index.keys())
return self._page_by_index[index], self._by_index[index]
def __getitem__(self, index: int):
return self._by_index[index]
class SearchResults(Results): class SearchResults(Results):
def __init__( def __init__(
self, self,
pages: Tuple[Type[Page], ...] = None pages: Tuple[Type[Page], ...] = None,
**kwargs,
) -> None: ) -> None:
super().__init__() super().__init__(**kwargs)
self.pages = pages or [] self.pages = pages or []
# this would initialize a list for every page, which I don't think I want # this would initialize a list for every page, which I don't think I want
@@ -55,8 +58,11 @@ class SearchResults(Results):
def get_page_results(self, page: Type[Page]) -> "PageResults": def get_page_results(self, page: Type[Page]) -> "PageResults":
return PageResults(page, self.results.get(page, [])) return PageResults(page, self.results.get(page, []))
def formated_generator(self, max_items_per_page: int = 10): def __len__(self) -> int:
super().formated_generator() return sum(min(self.max_items_per_page, len(results)) for results in self.results.values())
def formatted_generator(self):
super().formatted_generator()
i = 0 i = 0
for page in self.results: for page in self.results:
@@ -70,19 +76,37 @@ class SearchResults(Results):
i += 1 i += 1
j += 1 j += 1
if j >= max_items_per_page: if j >= self.max_items_per_page:
break break
class GoToResults(Results):
def __init__(self, results: List[DatabaseObject], **kwargs):
self.results: List[DatabaseObject] = results
super().__init__(**kwargs)
def __getitem__(self, index: int):
return self.results[index]
def __len__(self) -> int:
return len(self.results)
def formatted_generator(self):
yield from (Option(i, o) for i, o in enumerate(self.results))
class PageResults(Results): class PageResults(Results):
def __init__(self, page: Type[Page], results: List[DatabaseObject]) -> None: def __init__(self, page: Type[Page], results: List[DatabaseObject], **kwargs) -> None:
super().__init__() super().__init__(**kwargs)
self.page: Type[Page] = page self.page: Type[Page] = page
self.results: List[DatabaseObject] = results self.results: List[DatabaseObject] = results
def formated_generator(self, max_items_per_page: int = 10):
super().formated_generator() def formatted_generator(self, max_items_per_page: int = 10):
super().formatted_generator()
i = 0 i = 0
yield self.page yield self.page
@@ -92,3 +116,6 @@ class PageResults(Results):
self._by_index[i] = option self._by_index[i] = option
self._page_by_index[i] = self.page self._page_by_index[i] = self.page
i += 1 i += 1
def __len__(self) -> int:
return len(self.results)

View File

@@ -3,7 +3,7 @@ from .option import Options
from .metadata import Metadata, Mapping as ID3Mapping, ID3Timestamp from .metadata import Metadata, Mapping as ID3Mapping, ID3Timestamp
from .source import Source, SourcePages, SourceTypes from .source import Source, SourceType
from .song import ( from .song import (
Song, Song,
@@ -24,4 +24,4 @@ from .parents import OuterProxy
from .artwork import Artwork from .artwork import Artwork
DatabaseObject = TypeVar('T', bound=OuterProxy) DatabaseObject = OuterProxy

View File

@@ -53,10 +53,12 @@ class Artwork:
def get_variant_name(self, variant: ArtworkVariant) -> str: def get_variant_name(self, variant: ArtworkVariant) -> str:
return f"artwork_{variant['width']}x{variant['height']}_{hash_url(variant['url']).replace('/', '_')}" return f"artwork_{variant['width']}x{variant['height']}_{hash_url(variant['url']).replace('/', '_')}"
def __merge__(self, other: Artwork, override: bool = False) -> None: def __merge__(self, other: Artwork, **kwargs) -> None:
for key, value in other._variant_mapping.items(): for key, value in other._variant_mapping.items():
if key not in self._variant_mapping or override: if key not in self._variant_mapping:
self._variant_mapping[key] = value self._variant_mapping[key] = value
def __eq__(self, other: Artwork) -> bool: def __eq__(self, other: Artwork) -> bool:
if not isinstance(other, Artwork):
return False
return any(a == b for a, b in zip(self._variant_mapping.keys(), other._variant_mapping.keys())) return any(a == b for a, b in zip(self._variant_mapping.keys(), other._variant_mapping.keys()))

View File

@@ -1,9 +1,12 @@
from __future__ import annotations from __future__ import annotations
from collections import defaultdict from collections import defaultdict
from typing import TypeVar, Generic, Dict, Optional, Iterable, List, Iterator, Tuple, Generator, Union, Any from typing import TypeVar, Generic, Dict, Optional, Iterable, List, Iterator, Tuple, Generator, Union, Any, Set
import copy
from .parents import OuterProxy from .parents import OuterProxy
from ..utils import object_trace from ..utils import object_trace
from ..utils import output, BColors
T = TypeVar('T', bound=OuterProxy) T = TypeVar('T', bound=OuterProxy)
@@ -13,8 +16,8 @@ class Collection(Generic[T]):
_data: List[T] _data: List[T]
_indexed_values: Dict[str, set] _indexed_from_id: Dict[int, Dict[str, Any]]
_indexed_to_objects: Dict[any, list] _indexed_values: Dict[str, Dict[Any, T]]
shallow_list = property(fget=lambda self: self.data) shallow_list = property(fget=lambda self: self.data)
@@ -36,8 +39,8 @@ class Collection(Generic[T]):
self.append_object_to_attribute: Dict[str, T] = append_object_to_attribute or {} self.append_object_to_attribute: Dict[str, T] = append_object_to_attribute or {}
self.extend_object_to_attribute: Dict[str, Collection[T]] = extend_object_to_attribute or {} self.extend_object_to_attribute: Dict[str, Collection[T]] = extend_object_to_attribute or {}
self.sync_on_append: Dict[str, Collection] = sync_on_append or {} self.sync_on_append: Dict[str, Collection] = sync_on_append or {}
self.pull_from: List[Collection] = []
self._id_to_index_values: Dict[int, set] = defaultdict(set) self.push_to: List[Collection] = []
# This is to cleanly unmap previously mapped items by their id # This is to cleanly unmap previously mapped items by their id
self._indexed_from_id: Dict[int, Dict[str, Any]] = defaultdict(dict) self._indexed_from_id: Dict[int, Dict[str, Any]] = defaultdict(dict)
@@ -46,11 +49,19 @@ class Collection(Generic[T]):
self.extend(data) self.extend(data)
def __repr__(self) -> str: def __hash__(self) -> int:
return f"Collection({id(self)})" return id(self)
def _map_element(self, __object: T, from_map: bool = False): @property
self._unmap_element(__object.id) def collection_names(self) -> List[str]:
return list(set(self._collection_for.values()))
def __repr__(self) -> str:
return f"Collection({' | '.join(self.collection_names)} {id(self)})"
def _map_element(self, __object: T, no_unmap: bool = False, **kwargs):
if not no_unmap:
self._unmap_element(__object.id)
self._indexed_from_id[__object.id]["id"] = __object.id self._indexed_from_id[__object.id]["id"] = __object.id
self._indexed_values["id"][__object.id] = __object self._indexed_values["id"][__object.id] = __object
@@ -74,73 +85,128 @@ class Collection(Generic[T]):
del self._indexed_from_id[obj_id] del self._indexed_from_id[obj_id]
def _find_object(self, __object: T) -> Optional[T]: def _remap(self):
# reinitialize the mapping to clean it without time consuming operations
self._indexed_from_id: Dict[int, Dict[str, Any]] = defaultdict(dict)
self._indexed_values: Dict[str, Dict[Any, T]] = defaultdict(dict)
for e in self._data:
self._map_element(e, no_unmap=True)
def _find_object(self, __object: T, **kwargs) -> Optional[T]:
self._remap()
if __object.id in self._indexed_from_id:
return self._indexed_values["id"][__object.id]
for name, value in __object.indexing_values: for name, value in __object.indexing_values:
if value in self._indexed_values[name]: if value in self._indexed_values[name]:
return self._indexed_values[name][value] return self._indexed_values[name][value]
def append(self, __object: Optional[T], already_is_parent: bool = False, from_map: bool = False): return None
def _append_new_object(self, other: T, **kwargs):
"""
This function appends the other object to the current collection.
This only works if not another object, which represents the same real life object exists in the collection.
"""
self._data.append(other)
other._inner._is_in_collection.add(self)
for attribute, a in self.sync_on_append.items():
# syncing two collections by reference
b = other.__getattribute__(attribute)
if a is b:
continue
object_trace(f"Syncing [{a}] = [{b}]")
b_data = b.data.copy()
b_collection_for = b._collection_for.copy()
del b
for synced_with, key in b_collection_for.items():
synced_with.__setattr__(key, a)
a._collection_for[synced_with] = key
a.extend(b_data, **kwargs)
# all of the existing hooks to get the defined datastructures
for collection_attribute, generator in self.extend_object_to_attribute.items():
other.__getattribute__(collection_attribute).extend(generator, **kwargs)
for attribute, new_object in self.append_object_to_attribute.items():
other.__getattribute__(attribute).append(new_object, **kwargs)
def append(self, other: Optional[T], **kwargs):
""" """
If an object, that represents the same entity exists in a relevant collection, If an object, that represents the same entity exists in a relevant collection,
merge into this object. (and remap) merge into this object. (and remap)
Else append to this collection. Else append to this collection.
:param __object: :param other:
:param already_is_parent:
:param from_map:
:return: :return:
""" """
if __object is None: if other is None:
return
if not other._inner._has_data:
return
if other.id in self._indexed_from_id:
return return
existing_object = self._find_object(__object) object_trace(f"Appending {other.option_string} to {self}")
if existing_object is None:
# append
self._data.append(__object)
self._map_element(__object)
for collection_attribute, child_collection in self.extend_object_to_attribute.items():
__object.__getattribute__(collection_attribute).extend(child_collection)
for attribute, new_object in self.append_object_to_attribute.items():
__object.__getattribute__(attribute).append(new_object)
# only modify collections if the object actually has been appended
for attribute, a in self.sync_on_append.items():
b = __object.__getattribute__(attribute)
object_trace(f"Syncing [{a}{id(a)}] = [{b}{id(b)}]")
data_to_extend = b.data
a._collection_for.update(b._collection_for)
for synced_with, key in b._collection_for.items():
synced_with.__setattr__(key, a)
a.extend(data_to_extend)
# switching collection in the case of push to
for c in self.push_to:
r = c._find_object(other)
if r is not None:
# output("found push to", r, other, c, self, color=BColors.RED, sep="\t")
return c.append(other, **kwargs)
for c in self.pull_from:
r = c._find_object(other)
if r is not None:
# output("found pull from", r, other, c, self, color=BColors.RED, sep="\t")
c.remove(r, existing=r, **kwargs)
existing = self._find_object(other)
if existing is None:
self._append_new_object(other, **kwargs)
else: else:
# merge only if the two objects are not the same existing.merge(other, **kwargs)
if existing_object.id == __object.id:
return
old_id = existing_object.id def remove(self, *other_list: List[T], silent: bool = False, existing: Optional[T] = None, remove_from_other_collection=True, **kwargs):
other: T
for other in other_list:
existing: Optional[T] = existing or self._indexed_values["id"].get(other.id, None)
if existing is None:
if not silent:
raise ValueError(f"Object {other} not found in {self}")
return other
existing_object.merge(__object) if remove_from_other_collection:
for c in copy.copy(other._inner._is_in_collection):
c.remove(other, silent=True, remove_from_other_collection=False, **kwargs)
other._inner._is_in_collection = set()
else:
self._data.remove(existing)
self._unmap_element(existing)
if existing_object.id != old_id: def contains(self, __object: T) -> bool:
self._unmap_element(old_id) return self._find_object(__object) is not None
self._map_element(existing_object) def extend(self, other_collections: Optional[Generator[T, None, None]], **kwargs):
if other_collections is None:
def extend(self, __iterable: Optional[Generator[T, None, None]]):
if __iterable is None:
return return
for __object in __iterable: for other_object in other_collections:
self.append(__object) self.append(other_object, **kwargs)
@property @property
def data(self) -> List[T]: def data(self) -> List[T]:
@@ -156,8 +222,9 @@ class Collection(Generic[T]):
def __iter__(self) -> Iterator[T]: def __iter__(self) -> Iterator[T]:
yield from self._data yield from self._data
def __merge__(self, __other: Collection, override: bool = False): def __merge__(self, other: Collection, **kwargs):
self.extend(__other) object_trace(f"merging {str(self)} | {str(other)}")
self.extend(other, **kwargs)
def __getitem__(self, item: int): def __getitem__(self, item: int):
return self._data[item] return self._data[item]
@@ -166,3 +233,9 @@ class Collection(Generic[T]):
if item >= len(self._data): if item >= len(self._data):
return default return default
return self._data[item] return self._data[item]
def __eq__(self, other: Collection) -> bool:
if self.empty and other.empty:
return True
return self._data == other._data

View File

@@ -32,14 +32,27 @@ class FormattedText:
if self.is_empty and other.is_empty: if self.is_empty and other.is_empty:
return True return True
return self.doc == other.doc return self.html == other.html
@property @property
def markdown(self) -> str: def markdown(self) -> str:
return md(self.html).strip() return md(self.html).strip()
@markdown.setter
def markdown(self, value: str) -> None:
self.html = mistune.markdown(value)
@property
def plain(self) -> str:
md = self.markdown
return md.replace("\n\n", "\n")
@plain.setter
def plain(self, value: str) -> None:
self.html = mistune.markdown(plain_to_markdown(value))
def __str__(self) -> str: def __str__(self) -> str:
return self.markdown return self.markdown
plaintext = markdown plaintext = plain

View File

@@ -34,6 +34,6 @@ class Lyrics(OuterProxy):
@property @property
def metadata(self) -> Metadata: def metadata(self) -> Metadata:
return Metadata({ return Metadata({
id3Mapping.UNSYNCED_LYRICS: [self.text.markdown] id3Mapping.UNSYNCED_LYRICS: [self.text.plaintext]
}) })

View File

@@ -92,7 +92,7 @@ class Mapping(Enum):
key = attribute.value key = attribute.value
if key[0] == 'T': if key[0] == 'T':
# a text fiel # a text field
return cls.get_text_instance(key, value) return cls.get_text_instance(key, value)
if key[0] == "W": if key[0] == "W":
# an url field # an url field
@@ -355,7 +355,12 @@ class Metadata:
return None return None
list_data = self.id3_dict[field] list_data = self.id3_dict[field]
#correct duplications
correct_list_data = list()
for data in list_data:
if data not in correct_list_data:
correct_list_data.append(data)
list_data = correct_list_data
# convert for example the time objects to timestamps # convert for example the time objects to timestamps
for i, element in enumerate(list_data): for i, element in enumerate(list_data):
# for performances sake I don't do other checks if it is already the right type # for performances sake I don't do other checks if it is already the right type
@@ -395,6 +400,5 @@ class Metadata:
""" """
# set the tagging timestamp to the current time # set the tagging timestamp to the current time
self.__setitem__(Mapping.TAGGING_TIME, [ID3Timestamp.now()]) self.__setitem__(Mapping.TAGGING_TIME, [ID3Timestamp.now()])
for field in self.id3_dict: for field in self.id3_dict:
yield self.get_mutagen_object(field) yield self.get_mutagen_object(field)

View File

@@ -8,10 +8,11 @@ from typing import Optional, Dict, Tuple, List, Type, Generic, Any, TypeVar, Set
from pathlib import Path from pathlib import Path
import inspect import inspect
from .source import SourceCollection
from .metadata import Metadata from .metadata import Metadata
from ..utils import get_unix_time, object_trace from ..utils import get_unix_time, object_trace, generate_id
from ..utils.config import logging_settings, main_settings from ..utils.config import logging_settings, main_settings
from ..utils.shared import HIGHEST_ID from ..utils.shared import HIGHEST_ID, DEBUG_PRINT_ID
from ..utils.hacking import MetaClass from ..utils.hacking import MetaClass
LOGGER = logging_settings["object_logger"] LOGGER = logging_settings["object_logger"]
@@ -29,9 +30,17 @@ class InnerData:
""" """
_refers_to_instances: set = None _refers_to_instances: set = None
_is_in_collection: set = None
_has_data: bool = False
"""
Attribute versions keep track, of if the attribute has been changed.
"""
def __init__(self, object_type, **kwargs): def __init__(self, object_type, **kwargs):
self._refers_to_instances = set() self._refers_to_instances = set()
self._is_in_collection = set()
self._fetched_from: dict = {} self._fetched_from: dict = {}
# initialize the default values # initialize the default values
@@ -42,21 +51,39 @@ class InnerData:
for key, value in kwargs.items(): for key, value in kwargs.items():
if hasattr(value, "__is_collection__"): if hasattr(value, "__is_collection__"):
value._collection_for[self] = key value._collection_for[self] = key
self.__setattr__(key, value) self.__setattr__(key, value)
if self._has_data:
continue
def __setattr__(self, key: str, value):
if self._has_data or not hasattr(self, "_default_values"):
return super().__setattr__(key, value)
super().__setattr__("_has_data", not (key in self._default_values and self._default_values[key] == value))
return super().__setattr__(key, value)
def __hash__(self): def __hash__(self):
return self.id return self.id
def __merge__(self, __other: InnerData, override: bool = False): def __merge__(self, __other: InnerData, **kwargs):
""" """
:param __other: :param __other:
:param override:
:return: :return:
""" """
self._fetched_from.update(__other._fetched_from) self._fetched_from.update(__other._fetched_from)
self._is_in_collection.update(__other._is_in_collection)
for key, value in __other.__dict__.copy().items(): for key, value in __other.__dict__.copy().items():
if key.startswith("_"):
continue
if hasattr(value, "__is_collection__") and key in self.__dict__:
self.__getattribute__(key).__merge__(value, **kwargs)
continue
# just set the other value if self doesn't already have it # just set the other value if self doesn't already have it
if key not in self.__dict__ or (key in self.__dict__ and self.__dict__[key] == self._default_values.get(key)): if key not in self.__dict__ or (key in self.__dict__ and self.__dict__[key] == self._default_values.get(key)):
self.__setattr__(key, value) self.__setattr__(key, value)
@@ -64,13 +91,8 @@ class InnerData:
# if the object of value implemented __merge__, it merges # if the object of value implemented __merge__, it merges
existing = self.__getattribute__(key) existing = self.__getattribute__(key)
if hasattr(type(existing), "__merge__"): if hasattr(existing, "__merge__"):
existing.__merge__(value, override) existing.__merge__(value, **kwargs)
continue
# override the existing value if requested
if override:
self.__setattr__(key, value)
class OuterProxy: class OuterProxy:
@@ -78,14 +100,14 @@ class OuterProxy:
Wraps the inner data, and provides apis, to naturally access those values. Wraps the inner data, and provides apis, to naturally access those values.
""" """
_default_factories: dict = {} source_collection: SourceCollection
_default_factories: dict = {"source_collection": SourceCollection}
_outer_attribute: Set[str] = {"options", "metadata", "indexing_values", "option_string"} _outer_attribute: Set[str] = {"options", "metadata", "indexing_values", "option_string"}
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = tuple() DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = tuple()
UPWARDS_COLLECTION_STRING_ATTRIBUTES = tuple() UPWARDS_COLLECTION_STRING_ATTRIBUTES = tuple()
TITEL = "id"
def __init__(self, _id: int = None, dynamic: bool = False, **kwargs): def __init__(self, _id: int = None, dynamic: bool = False, **kwargs):
_automatic_id: bool = False _automatic_id: bool = False
@@ -94,7 +116,7 @@ class OuterProxy:
generates a random integer id generates a random integer id
the range is defined in the config the range is defined in the config
""" """
_id = random.randint(0, HIGHEST_ID) _id = generate_id()
_automatic_id = True _automatic_id = True
kwargs["automatic_id"] = _automatic_id kwargs["automatic_id"] = _automatic_id
@@ -116,7 +138,7 @@ class OuterProxy:
self._inner: InnerData = InnerData(type(self), **kwargs) self._inner: InnerData = InnerData(type(self), **kwargs)
self._inner._refers_to_instances.add(self) self._inner._refers_to_instances.add(self)
object_trace(f"creating {type(self).__name__} [{self.title_string}]") object_trace(f"creating {type(self).__name__} [{self.option_string}]")
self.__init_collections__() self.__init_collections__()
@@ -173,18 +195,18 @@ class OuterProxy:
def __eq__(self, other: Any): def __eq__(self, other: Any):
return self.__hash__() == other.__hash__() return self.__hash__() == other.__hash__()
def merge(self, __other: Optional[OuterProxy], override: bool = False): def merge(self, __other: Optional[OuterProxy], **kwargs):
""" """
1. merges the data of __other in self 1. merges the data of __other in self
2. replaces the data of __other with the data of self 2. replaces the data of __other with the data of self
:param __other: :param __other:
:param override:
:return: :return:
""" """
if __other is None: if __other is None:
return return
a_id = self.id
a = self a = self
b = __other b = __other
@@ -196,7 +218,7 @@ class OuterProxy:
if len(b._inner._refers_to_instances) > len(a._inner._refers_to_instances): if len(b._inner._refers_to_instances) > len(a._inner._refers_to_instances):
a, b = b, a a, b = b, a
object_trace(f"merging {type(a).__name__} [{a.title_string} | {a.id}] with {type(b).__name__} [{b.title_string} | {b.id}]") object_trace(f"merging {a.option_string} | {b.option_string}")
old_inner = b._inner old_inner = b._inner
@@ -204,11 +226,13 @@ class OuterProxy:
instance._inner = a._inner instance._inner = a._inner
a._inner._refers_to_instances.add(instance) a._inner._refers_to_instances.add(instance)
a._inner.__merge__(old_inner, override=override) a._inner.__merge__(old_inner, **kwargs)
del old_inner del old_inner
def __merge__(self, __other: Optional[OuterProxy], override: bool = False): self.id = a_id
self.merge(__other, override)
def __merge__(self, __other: Optional[OuterProxy], **kwargs):
self.merge(__other, **kwargs)
def mark_as_fetched(self, *url_hash_list: List[str]): def mark_as_fetched(self, *url_hash_list: List[str]):
for url_hash in url_hash_list: for url_hash in url_hash_list:
@@ -235,7 +259,23 @@ class OuterProxy:
@property @property
def options(self) -> List[P]: def options(self) -> List[P]:
return [self] r = []
for collection_string_attribute in self.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
r.extend(self.__getattribute__(collection_string_attribute))
r.append(self)
for collection_string_attribute in self.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
r.extend(self.__getattribute__(collection_string_attribute))
return r
@property
def option_string(self) -> str:
return self.title_string
INDEX_DEPENDS_ON: List[str] = []
@property @property
def indexing_values(self) -> List[Tuple[str, object]]: def indexing_values(self) -> List[Tuple[str, object]]:
@@ -267,9 +307,49 @@ class OuterProxy:
return r return r
@property
def root_collections(self) -> List[Collection]:
if len(self.UPWARDS_COLLECTION_STRING_ATTRIBUTES) == 0:
return [self]
r = []
for collection_string_attribute in self.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
r.extend(self.__getattribute__(collection_string_attribute))
return r
def _compile(self, **kwargs):
pass
def compile(self, from_root=False, **kwargs):
# compile from the root
if not from_root:
for c in self.root_collections:
c.compile(from_root=True, **kwargs)
return
self._compile(**kwargs)
for c_attribute in self.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
for c in self.__getattribute__(c_attribute):
c.compile(from_root=True, **kwargs)
TITEL = "id"
@property @property
def title_string(self) -> str: def title_string(self) -> str:
return str(self.__getattribute__(self.TITEL)) + (f" {self.id}" if DEBUG_PRINT_ID else "")
@property
def title_value(self) -> str:
return str(self.__getattribute__(self.TITEL)) return str(self.__getattribute__(self.TITEL))
def __repr__(self): def __repr__(self):
return f"{type(self).__name__}({self.title_string})" return f"{type(self).__name__}({self.title_string})"
def get_child_collections(self):
for collection_string_attribute in self.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
yield self.__getattribute__(collection_string_attribute)
def get_parent_collections(self):
for collection_string_attribute in self.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
yield self.__getattribute__(collection_string_attribute)

View File

@@ -3,6 +3,7 @@ from __future__ import annotations
import random import random
from collections import defaultdict from collections import defaultdict
from typing import List, Optional, Dict, Tuple, Type, Union from typing import List, Optional, Dict, Tuple, Type, Union
import copy
import pycountry import pycountry
@@ -22,6 +23,7 @@ from .parents import OuterProxy, P
from .source import Source, SourceCollection from .source import Source, SourceCollection
from .target import Target from .target import Target
from .country import Language, Country from .country import Language, Country
from ..utils.shared import DEBUG_PRINT_ID
from ..utils.string_processing import unify from ..utils.string_processing import unify
from .parents import OuterProxy as Base from .parents import OuterProxy as Base
@@ -43,7 +45,8 @@ def get_collection_string(
template: str, template: str,
ignore_titles: Set[str] = None, ignore_titles: Set[str] = None,
background: BColors = OPTION_BACKGROUND, background: BColors = OPTION_BACKGROUND,
foreground: BColors = OPTION_FOREGROUND foreground: BColors = OPTION_FOREGROUND,
add_id: bool = DEBUG_PRINT_ID,
) -> str: ) -> str:
if collection.empty: if collection.empty:
return "" return ""
@@ -55,8 +58,15 @@ def get_collection_string(
r = background r = background
def get_element_str(element) -> str:
nonlocal add_id
r = element.title_string.strip()
if add_id and False:
r += " " + str(element.id)
return r
element: Base element: Base
titel_list: List[str] = [element.title_string.strip() for element in collection if element.title_string not in ignore_titles] titel_list: List[str] = [get_element_str(element) for element in collection if element.title_string not in ignore_titles]
for i, titel in enumerate(titel_list): for i, titel in enumerate(titel_list):
delimiter = ", " delimiter = ", "
@@ -85,7 +95,7 @@ class Song(Base):
target_collection: Collection[Target] target_collection: Collection[Target]
lyrics_collection: Collection[Lyrics] lyrics_collection: Collection[Lyrics]
main_artist_collection: Collection[Artist] artist_collection: Collection[Artist]
feature_artist_collection: Collection[Artist] feature_artist_collection: Collection[Artist]
album_collection: Collection[Album] album_collection: Collection[Album]
@@ -97,11 +107,11 @@ class Song(Base):
"lyrics_collection": Collection, "lyrics_collection": Collection,
"artwork": Artwork, "artwork": Artwork,
"main_artist_collection": Collection,
"album_collection": Collection, "album_collection": Collection,
"artist_collection": Collection,
"feature_artist_collection": Collection, "feature_artist_collection": Collection,
"title": lambda: "", "title": lambda: None,
"unified_title": lambda: None, "unified_title": lambda: None,
"isrc": lambda: None, "isrc": lambda: None,
"genre": lambda: None, "genre": lambda: None,
@@ -109,30 +119,47 @@ class Song(Base):
"tracksort": lambda: 0, "tracksort": lambda: 0,
} }
def __init__(self, title: str = "", unified_title: str = None, isrc: str = None, length: int = None, def __init__(
genre: str = None, note: FormattedText = None, source_list: List[Source] = None, self,
target_list: List[Target] = None, lyrics_list: List[Lyrics] = None, title: str = None,
main_artist_list: List[Artist] = None, feature_artist_list: List[Artist] = None, isrc: str = None,
album_list: List[Album] = None, tracksort: int = 0, artwork: Optional[Artwork] = None, **kwargs) -> None: length: int = None,
genre: str = None,
note: FormattedText = None,
source_list: List[Source] = None,
target_list: List[Target] = None,
lyrics_list: List[Lyrics] = None,
artist_list: List[Artist] = None,
feature_artist_list: List[Artist] = None,
album_list: List[Album] = None,
tracksort: int = 0,
artwork: Optional[Artwork] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
Base.__init__(**locals()) Base.__init__(**real_kwargs)
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("album_collection", "main_artist_collection", "feature_artist_collection") UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("artist_collection", "feature_artist_collection", "album_collection")
TITEL = "title" TITEL = "title"
def __init_collections__(self) -> None: def __init_collections__(self) -> None:
self.feature_artist_collection.push_to = [self.artist_collection]
self.artist_collection.pull_from = [self.feature_artist_collection]
self.album_collection.sync_on_append = { self.album_collection.sync_on_append = {
"artist_collection": self.main_artist_collection, "artist_collection": self.artist_collection,
} }
self.album_collection.append_object_to_attribute = { self.album_collection.append_object_to_attribute = {
"song_collection": self, "song_collection": self,
} }
self.main_artist_collection.extend_object_to_attribute = { self.artist_collection.extend_object_to_attribute = {
"main_album_collection": self.album_collection "album_collection": self.album_collection
} }
self.feature_artist_collection.append_object_to_attribute = { self.feature_artist_collection.extend_object_to_attribute = {
"feature_song_collection": self "album_collection": self.album_collection
} }
def _add_other_db_objects(self, object_type: Type[OuterProxy], object_list: List[OuterProxy]): def _add_other_db_objects(self, object_type: Type[OuterProxy], object_list: List[OuterProxy]):
@@ -144,20 +171,21 @@ class Song(Base):
return return
if isinstance(object_list, Artist): if isinstance(object_list, Artist):
self.main_artist_collection.extend(object_list) self.feature_artist_collection.extend(object_list)
return return
if isinstance(object_list, Album): if isinstance(object_list, Album):
self.album_collection.extend(object_list) self.album_collection.extend(object_list)
return return
INDEX_DEPENDS_ON = ("title", "isrc", "source_collection")
@property @property
def indexing_values(self) -> List[Tuple[str, object]]: def indexing_values(self) -> List[Tuple[str, object]]:
return [ return [
('id', self.id),
('title', unify(self.title)), ('title', unify(self.title)),
('isrc', self.isrc), ('isrc', self.isrc),
*[('url', source.url) for source in self.source_collection] *self.source_collection.indexing_values(),
] ]
@property @property
@@ -169,18 +197,20 @@ class Song(Base):
id3Mapping.GENRE: [self.genre], id3Mapping.GENRE: [self.genre],
id3Mapping.TRACKNUMBER: [self.tracksort_str], id3Mapping.TRACKNUMBER: [self.tracksort_str],
id3Mapping.COMMENT: [self.note.markdown], id3Mapping.COMMENT: [self.note.markdown],
id3Mapping.FILE_WEBPAGE_URL: self.source_collection.url_list,
id3Mapping.SOURCE_WEBPAGE_URL: self.source_collection.homepage_list,
}) })
# metadata.merge_many([s.get_song_metadata() for s in self.source_collection]) album sources have no relevant metadata for id3 # metadata.merge_many([s.get_song_metadata() for s in self.source_collection]) album sources have no relevant metadata for id3
metadata.merge_many([a.metadata for a in self.album_collection]) metadata.merge_many([a.metadata for a in self.album_collection])
metadata.merge_many([a.metadata for a in self.main_artist_collection]) metadata.merge_many([a.metadata for a in self.artist_collection])
metadata.merge_many([a.metadata for a in self.feature_artist_collection]) metadata.merge_many([a.metadata for a in self.feature_artist_collection])
metadata.merge_many([lyrics.metadata for lyrics in self.lyrics_collection]) metadata.merge_many([lyrics.metadata for lyrics in self.lyrics_collection])
return metadata return metadata
def get_artist_credits(self) -> str: def get_artist_credits(self) -> str:
main_artists = ", ".join([artist.name for artist in self.main_artist_collection]) main_artists = ", ".join([artist.name for artist in self.artist_collection])
feature_artists = ", ".join([artist.name for artist in self.feature_artist_collection]) feature_artists = ", ".join([artist.name for artist in self.feature_artist_collection])
if len(feature_artists) == 0: if len(feature_artists) == 0:
@@ -189,20 +219,13 @@ class Song(Base):
@property @property
def option_string(self) -> str: def option_string(self) -> str:
r = OPTION_FOREGROUND.value + self.title + BColors.ENDC.value + OPTION_BACKGROUND.value r = "song "
r += OPTION_FOREGROUND.value + self.title_string + BColors.ENDC.value + OPTION_BACKGROUND.value
r += get_collection_string(self.album_collection, " from {}", ignore_titles={self.title}) r += get_collection_string(self.album_collection, " from {}", ignore_titles={self.title})
r += get_collection_string(self.main_artist_collection, " by {}") r += get_collection_string(self.artist_collection, " by {}")
r += get_collection_string(self.feature_artist_collection, " feat. {}") r += get_collection_string(self.feature_artist_collection, " feat. {}" if len(self.artist_collection) > 0 else " by {}")
return r return r
@property
def options(self) -> List[P]:
options = self.main_artist_collection.shallow_list
options.extend(self.feature_artist_collection)
options.extend(self.album_collection)
options.append(self)
return options
@property @property
def tracksort_str(self) -> str: def tracksort_str(self) -> str:
""" """
@@ -215,11 +238,6 @@ class Song(Base):
return f"{self.tracksort}/{len(self.album_collection[0].song_collection) or 1}" return f"{self.tracksort}/{len(self.album_collection[0].song_collection) or 1}"
"""
All objects dependent on Album
"""
class Album(Base): class Album(Base):
title: str title: str
unified_title: str unified_title: str
@@ -233,8 +251,9 @@ class Album(Base):
source_collection: SourceCollection source_collection: SourceCollection
artist_collection: Collection[Artist]
song_collection: Collection[Song] song_collection: Collection[Song]
artist_collection: Collection[Artist]
feature_artist_collection: Collection[Artist]
label_collection: Collection[Label] label_collection: Collection[Label]
_default_factories = { _default_factories = {
@@ -250,37 +269,54 @@ class Album(Base):
"notes": FormattedText, "notes": FormattedText,
"source_collection": SourceCollection, "source_collection": SourceCollection,
"artist_collection": Collection,
"song_collection": Collection, "song_collection": Collection,
"artist_collection": Collection,
"feature_artist_collection": Collection,
"label_collection": Collection, "label_collection": Collection,
} }
TITEL = "title" TITEL = "title"
# This is automatically generated # This is automatically generated
def __init__(self, title: str = None, unified_title: str = None, album_status: AlbumStatus = None, def __init__(
album_type: AlbumType = None, language: Language = None, date: ID3Timestamp = None, self,
barcode: str = None, albumsort: int = None, notes: FormattedText = None, title: str = None,
source_list: List[Source] = None, artist_list: List[Artist] = None, song_list: List[Song] = None, unified_title: str = None,
label_list: List[Label] = None, **kwargs) -> None: album_status: AlbumStatus = None,
super().__init__(title=title, unified_title=unified_title, album_status=album_status, album_type=album_type, album_type: AlbumType = None,
language=language, date=date, barcode=barcode, albumsort=albumsort, notes=notes, language: Language = None,
source_list=source_list, artist_list=artist_list, song_list=song_list, label_list=label_list, date: ID3Timestamp = None,
**kwargs) barcode: str = None,
albumsort: int = None,
notes: FormattedText = None,
source_list: List[Source] = None,
artist_list: List[Artist] = None,
song_list: List[Song] = None,
label_list: List[Label] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
Base.__init__(**real_kwargs)
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = ("song_collection",) DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = ("song_collection",)
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("artist_collection", "label_collection") UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("label_collection", "artist_collection")
def __init_collections__(self): def __init_collections__(self):
self.feature_artist_collection.push_to = [self.artist_collection]
self.artist_collection.pull_from = [self.feature_artist_collection]
self.song_collection.append_object_to_attribute = { self.song_collection.append_object_to_attribute = {
"album_collection": self "album_collection": self
} }
self.song_collection.sync_on_append = { self.song_collection.sync_on_append = {
"main_artist_collection": self.artist_collection "artist_collection": self.artist_collection
} }
self.artist_collection.append_object_to_attribute = { self.artist_collection.append_object_to_attribute = {
"main_album_collection": self "album_collection": self
} }
self.artist_collection.extend_object_to_attribute = { self.artist_collection.extend_object_to_attribute = {
"label_collection": self.label_collection "label_collection": self.label_collection
@@ -302,13 +338,14 @@ class Album(Base):
self.label_collection.extend(object_list) self.label_collection.extend(object_list)
return return
INDEX_DEPENDS_ON = ("title", "barcode", "source_collection")
@property @property
def indexing_values(self) -> List[Tuple[str, object]]: def indexing_values(self) -> List[Tuple[str, object]]:
return [ return [
('id', self.id),
('title', unify(self.title)), ('title', unify(self.title)),
('barcode', self.barcode), ('barcode', self.barcode),
*[('url', source.url) for source in self.source_collection] *self.source_collection.indexing_values(),
] ]
@property @property
@@ -333,19 +370,36 @@ class Album(Base):
@property @property
def option_string(self) -> str: def option_string(self) -> str:
r = OPTION_FOREGROUND.value + self.title + BColors.ENDC.value + OPTION_BACKGROUND.value r = "album "
r += OPTION_FOREGROUND.value + self.title_string + BColors.ENDC.value + OPTION_BACKGROUND.value
r += get_collection_string(self.artist_collection, " by {}") r += get_collection_string(self.artist_collection, " by {}")
if len(self.artist_collection) <= 0:
r += get_collection_string(self.feature_artist_collection, " by {}")
r += get_collection_string(self.label_collection, " under {}") r += get_collection_string(self.label_collection, " under {}")
if len(self.song_collection) > 0: if len(self.song_collection) > 0:
r += f" with {len(self.song_collection)} songs" r += f" with {len(self.song_collection)} songs"
return r return r
@property def _compile(self):
def options(self) -> List[P]: self.analyze_implied_album_type()
options = [*self.artist_collection, self, *self.song_collection] self.update_tracksort()
self.fix_artist_collection()
return options def analyze_implied_album_type(self):
# if the song collection has only one song, it is reasonable to assume that it is a single
if len(self.song_collection) == 1:
self.album_type = AlbumType.SINGLE
return
# if the album already has an album type, we don't need to do anything
if self.album_type is not AlbumType.OTHER:
return
# for information on EP's I looked at https://www.reddit.com/r/WeAreTheMusicMakers/comments/a354ql/whats_the_cutoff_length_between_ep_and_album/
if len(self.song_collection) < 9:
self.album_type = AlbumType.EP
return
def update_tracksort(self): def update_tracksort(self):
""" """
@@ -372,17 +426,15 @@ class Album(Base):
tracksort_map[i] = existing_list.pop(0) tracksort_map[i] = existing_list.pop(0)
tracksort_map[i].tracksort = i tracksort_map[i].tracksort = i
def compile(self, merge_into: bool = False): def fix_artist_collection(self):
""" """
compiles the recursive structures, I add artists, that could only be feature artists to the feature artist collection.
and does depending on the object some other stuff. They get automatically moved to main artist collection, if a matching artist exists in the main artist collection or is appended to it later on.
If I am not sure for any artist, I try to analyze the most common artist in the song collection of one album.
no need to override if only the recursive structure should be built.
override self.build_recursive_structures() instead
""" """
self.update_tracksort() # move all artists that are in all feature_artist_collections, of every song, to the artist_collection
self._build_recursive_structures(build_version=random.randint(0, 99999), merge=merge_into) pass
@property @property
def copyright(self) -> str: def copyright(self) -> str:
@@ -415,34 +467,26 @@ class Album(Base):
return self.album_type.value return self.album_type.value
"""
All objects dependent on Artist
"""
class Artist(Base): class Artist(Base):
name: str name: str
unified_name: str
country: Country country: Country
formed_in: ID3Timestamp formed_in: ID3Timestamp
notes: FormattedText notes: FormattedText
lyrical_themes: List[str] lyrical_themes: List[str]
general_genre: str general_genre: str
unformated_location: str unformatted_location: str
source_collection: SourceCollection source_collection: SourceCollection
contact_collection: Collection[Contact] contact_collection: Collection[Contact]
feature_song_collection: Collection[Song] album_collection: Collection[Album]
main_album_collection: Collection[Album]
label_collection: Collection[Label] label_collection: Collection[Label]
_default_factories = { _default_factories = {
"name": str, "name": lambda: None,
"unified_name": lambda: None,
"country": lambda: None, "country": lambda: None,
"unformated_location": lambda: None, "unformatted_location": lambda: None,
"formed_in": ID3Timestamp, "formed_in": ID3Timestamp,
"notes": FormattedText, "notes": FormattedText,
@@ -450,8 +494,7 @@ class Artist(Base):
"general_genre": lambda: "", "general_genre": lambda: "",
"source_collection": SourceCollection, "source_collection": SourceCollection,
"feature_song_collection": Collection, "album_collection": Collection,
"main_album_collection": Collection,
"contact_collection": Collection, "contact_collection": Collection,
"label_collection": Collection, "label_collection": Collection,
} }
@@ -459,30 +502,37 @@ class Artist(Base):
TITEL = "name" TITEL = "name"
# This is automatically generated # This is automatically generated
def __init__(self, name: str = "", unified_name: str = None, country: Country = None, def __init__(
formed_in: ID3Timestamp = None, notes: FormattedText = None, lyrical_themes: List[str] = None, self,
general_genre: str = None, unformated_location: str = None, source_list: List[Source] = None, name: str = None,
contact_list: List[Contact] = None, feature_song_list: List[Song] = None, unified_name: str = None,
main_album_list: List[Album] = None, label_list: List[Label] = None, **kwargs) -> None: country: Country = None,
formed_in: ID3Timestamp = None,
notes: FormattedText = None,
lyrical_themes: List[str] = None,
general_genre: str = None,
unformatted_location: str = None,
source_list: List[Source] = None,
contact_list: List[Contact] = None,
feature_song_list: List[Song] = None,
album_list: List[Album] = None,
label_list: List[Label] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
super().__init__(name=name, unified_name=unified_name, country=country, formed_in=formed_in, notes=notes, Base.__init__(**real_kwargs)
lyrical_themes=lyrical_themes, general_genre=general_genre,
unformated_location=unformated_location, source_list=source_list, contact_list=contact_list,
feature_song_list=feature_song_list, main_album_list=main_album_list, label_list=label_list,
**kwargs)
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = ("feature_song_collection", "main_album_collection")
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = ("album_collection",)
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("label_collection",) UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("label_collection",)
def __init_collections__(self): def __init_collections__(self):
self.feature_song_collection.append_object_to_attribute = { self.album_collection.append_object_to_attribute = {
"feature_artist_collection": self "feature_artist_collection": self
} }
self.main_album_collection.append_object_to_attribute = {
"artist_collection": self
}
self.label_collection.append_object_to_attribute = { self.label_collection.append_object_to_attribute = {
"current_artist_collection": self "current_artist_collection": self
} }
@@ -490,39 +540,32 @@ class Artist(Base):
def _add_other_db_objects(self, object_type: Type[OuterProxy], object_list: List[OuterProxy]): def _add_other_db_objects(self, object_type: Type[OuterProxy], object_list: List[OuterProxy]):
if object_type is Song: if object_type is Song:
# this doesn't really make sense # this doesn't really make sense
# self.feature_song_collection.extend(object_list)
return return
if object_type is Artist: if object_type is Artist:
return return
if object_type is Album: if object_type is Album:
self.main_album_collection.extend(object_list) self.album_collection.extend(object_list)
return return
if object_type is Label: if object_type is Label:
self.label_collection.extend(object_list) self.label_collection.extend(object_list)
return return
@property def _compile(self):
def options(self) -> List[P]: self.update_albumsort()
options = [self, *self.main_album_collection.shallow_list, *self.feature_album]
print(options)
return options
def update_albumsort(self): def update_albumsort(self):
""" """
This updates the albumsort attributes, of the albums in This updates the albumsort attributes, of the albums in
`self.main_album_collection`, and sorts the albums, if possible. `self.album_collection`, and sorts the albums, if possible.
It is advised to only call this function, once all the albums are It is advised to only call this function, once all the albums are
added to the artist. added to the artist.
:return: :return:
""" """
if len(self.main_album_collection) <= 0:
return
type_section: Dict[AlbumType, int] = defaultdict(lambda: 2, { type_section: Dict[AlbumType, int] = defaultdict(lambda: 2, {
AlbumType.OTHER: 0, # if I don't know it, I add it to the first section AlbumType.OTHER: 0, # if I don't know it, I add it to the first section
AlbumType.STUDIO_ALBUM: 0, AlbumType.STUDIO_ALBUM: 0,
@@ -534,7 +577,7 @@ class Artist(Base):
# order albums in the previously defined section # order albums in the previously defined section
album: Album album: Album
for album in self.main_album_collection: for album in self.album_collection:
sections[type_section[album.album_type]].append(album) sections[type_section[album.album_type]].append(album)
def sort_section(_section: List[Album], last_albumsort: int) -> int: def sort_section(_section: List[Album], last_albumsort: int) -> int:
@@ -565,96 +608,40 @@ class Artist(Base):
album_list.extend(sections[section_index]) album_list.extend(sections[section_index])
# replace the old collection with the new one # replace the old collection with the new one
self.main_album_collection: Collection = Collection(data=album_list, element_type=Album) self.album_collection._data = album_list
INDEX_DEPENDS_ON = ("name", "source_collection", "contact_collection")
@property @property
def indexing_values(self) -> List[Tuple[str, object]]: def indexing_values(self) -> List[Tuple[str, object]]:
return [ return [
('id', self.id),
('name', unify(self.name)), ('name', unify(self.name)),
*[('url', source.url) for source in self.source_collection], *[('contact', contact.value) for contact in self.contact_collection],
*[('contact', contact.value) for contact in self.contact_collection] *self.source_collection.indexing_values(),
] ]
@property @property
def metadata(self) -> Metadata: def metadata(self) -> Metadata:
metadata = Metadata({ metadata = Metadata({
id3Mapping.ARTIST: [self.name] id3Mapping.ARTIST: [self.name],
id3Mapping.ARTIST_WEBPAGE_URL: self.source_collection.url_list,
}) })
metadata.merge_many([s.get_artist_metadata() for s in self.source_collection])
return metadata return metadata
"""
def __str__(self, include_notes: bool = False):
string = self.name or ""
if include_notes:
plaintext_notes = self.notes.get_plaintext()
if plaintext_notes is not None:
string += "\n" + plaintext_notes
return string
"""
def __repr__(self):
return f"Artist(\"{self.name}\")"
@property @property
def option_string(self) -> str: def option_string(self) -> str:
r = OPTION_FOREGROUND.value + self.name + BColors.ENDC.value + OPTION_BACKGROUND.value r = "artist "
r += OPTION_FOREGROUND.value + self.title_string + BColors.ENDC.value + OPTION_BACKGROUND.value
r += get_collection_string(self.label_collection, " under {}") r += get_collection_string(self.label_collection, " under {}")
r += OPTION_BACKGROUND.value r += OPTION_BACKGROUND.value
if len(self.main_album_collection) > 0: if len(self.album_collection) > 0:
r += f" with {len(self.main_album_collection)} albums" r += f" with {len(self.album_collection)} albums"
if len(self.feature_song_collection) > 0:
r += f" featured in {len(self.feature_song_collection)} songs"
r += BColors.ENDC.value r += BColors.ENDC.value
return r return r
@property
def options(self) -> List[P]:
options = [self]
options.extend(self.main_album_collection)
options.extend(self.feature_song_collection)
return options
@property
def feature_album(self) -> Album:
return Album(
title="features",
album_status=AlbumStatus.UNRELEASED,
album_type=AlbumType.COMPILATION_ALBUM,
is_split=True,
albumsort=666,
dynamic=True,
song_list=self.feature_song_collection.shallow_list
)
def get_all_songs(self) -> List[Song]:
"""
returns a list of all Songs.
probably not that useful, because it is unsorted
"""
collection = self.feature_song_collection.copy()
for album in self.discography:
collection.extend(album.song_collection)
return collection
@property
def discography(self) -> List[Album]:
flat_copy_discography = self.main_album_collection.copy()
flat_copy_discography.append(self.feature_album)
return flat_copy_discography
"""
Label
"""
class Label(Base): class Label(Base):
COLLECTION_STRING_ATTRIBUTES = ("album_collection", "current_artist_collection") COLLECTION_STRING_ATTRIBUTES = ("album_collection", "current_artist_collection")
@@ -683,12 +670,21 @@ class Label(Base):
TITEL = "name" TITEL = "name"
def __init__(self, name: str = None, unified_name: str = None, notes: FormattedText = None, def __init__(
source_list: List[Source] = None, contact_list: List[Contact] = None, self,
album_list: List[Album] = None, current_artist_list: List[Artist] = None, **kwargs) -> None: name: str = None,
super().__init__(name=name, unified_name=unified_name, notes=notes, source_list=source_list, unified_name: str = None,
contact_list=contact_list, album_list=album_list, current_artist_list=current_artist_list, notes: FormattedText = None,
**kwargs) source_list: List[Source] = None,
contact_list: List[Contact] = None,
album_list: List[Album] = None,
current_artist_list: List[Artist] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
Base.__init__(**real_kwargs)
def __init_collections__(self): def __init_collections__(self):
self.album_collection.append_object_to_attribute = { self.album_collection.append_object_to_attribute = {
@@ -702,7 +698,6 @@ class Label(Base):
@property @property
def indexing_values(self) -> List[Tuple[str, object]]: def indexing_values(self) -> List[Tuple[str, object]]:
return [ return [
('id', self.id),
('name', unify(self.name)), ('name', unify(self.name)),
*[('url', source.url) for source in self.source_collection] *[('url', source.url) for source in self.source_collection]
] ]
@@ -729,4 +724,4 @@ class Label(Base):
@property @property
def option_string(self): def option_string(self):
return OPTION_FOREGROUND.value + self.name + BColors.ENDC.value return "label " + OPTION_FOREGROUND.value + self.name + BColors.ENDC.value

View File

@@ -2,142 +2,237 @@ from __future__ import annotations
from collections import defaultdict from collections import defaultdict
from enum import Enum from enum import Enum
from typing import List, Dict, Set, Tuple, Optional, Iterable from typing import (
from urllib.parse import urlparse List,
Dict,
Set,
Tuple,
Optional,
Iterable,
Generator,
TypedDict,
Callable,
Any,
TYPE_CHECKING
)
from urllib.parse import urlparse, ParseResult
from dataclasses import dataclass, field
from functools import cached_property
from ..utils.enums.source import SourcePages, SourceTypes from ..utils import generate_id
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.config import youtube_settings from ..utils.config import youtube_settings
from ..utils.string_processing import hash_url from ..utils.string_processing import hash_url, shorten_display_url
from .metadata import Mapping, Metadata from .metadata import Mapping, Metadata
from .parents import OuterProxy if TYPE_CHECKING:
from .collection import Collection from ..pages.abstract import Page
class Source(OuterProxy):
@dataclass
class Source:
source_type: SourceType
url: str url: str
referrer_page: SourceType = None
audio_url: Optional[str] = None
page_enum: SourcePages additional_data: dict = field(default_factory=dict)
referer_page: SourcePages
audio_url: str def __post_init__(self):
self.referrer_page = self.referrer_page or self.source_type
_default_factories = {
"audio_url": lambda: None,
}
# This is automatically generated
def __init__(self, page_enum: SourcePages, url: str, referer_page: SourcePages = None, audio_url: str = None,
**kwargs) -> None:
if referer_page is None:
referer_page = page_enum
super().__init__(url=url, page_enum=page_enum, referer_page=referer_page, audio_url=audio_url, **kwargs)
@classmethod @classmethod
def match_url(cls, url: str, referer_page: SourcePages) -> Optional["Source"]: def match_url(cls, url: str, referrer_page: SourceType) -> Optional[Source]:
""" """
this shouldn't be used, unlesse you are not certain what the source is for this shouldn't be used, unless you are not certain what the source is for
the reason is that it is more inefficient the reason is that it is more inefficient
""" """
parsed = urlparse(url) parsed_url = urlparse(url)
url = parsed.geturl() url = parsed_url.geturl()
if "musify" in parsed.netloc: if "musify" in parsed_url.netloc:
return cls(SourcePages.MUSIFY, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.MUSIFY, url, referrer_page=referrer_page)
if parsed.netloc in [_url.netloc for _url in youtube_settings['youtube_url']]: if parsed_url.netloc in [_url.netloc for _url in youtube_settings['youtube_url']]:
return cls(SourcePages.YOUTUBE, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.YOUTUBE, url, referrer_page=referrer_page)
if url.startswith("https://www.deezer"): if url.startswith("https://www.deezer"):
return cls(SourcePages.DEEZER, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.DEEZER, url, referrer_page=referrer_page)
if url.startswith("https://open.spotify.com"): if url.startswith("https://open.spotify.com"):
return cls(SourcePages.SPOTIFY, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.SPOTIFY, url, referrer_page=referrer_page)
if "bandcamp" in url: if "bandcamp" in url:
return cls(SourcePages.BANDCAMP, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.BANDCAMP, url, referrer_page=referrer_page)
if "wikipedia" in parsed.netloc: if "wikipedia" in parsed_url.netloc:
return cls(SourcePages.WIKIPEDIA, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.WIKIPEDIA, url, referrer_page=referrer_page)
if url.startswith("https://www.metal-archives.com/"): if url.startswith("https://www.metal-archives.com/"):
return cls(SourcePages.ENCYCLOPAEDIA_METALLUM, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, url, referrer_page=referrer_page)
# the less important once # the less important once
if url.startswith("https://www.facebook"): if url.startswith("https://www.facebook"):
return cls(SourcePages.FACEBOOK, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.FACEBOOK, url, referrer_page=referrer_page)
if url.startswith("https://www.instagram"): if url.startswith("https://www.instagram"):
return cls(SourcePages.INSTAGRAM, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.INSTAGRAM, url, referrer_page=referrer_page)
if url.startswith("https://twitter"): if url.startswith("https://twitter"):
return cls(SourcePages.TWITTER, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.TWITTER, url, referrer_page=referrer_page)
if url.startswith("https://myspace.com"): if url.startswith("https://myspace.com"):
return cls(SourcePages.MYSPACE, url, referer_page=referer_page) return cls(ALL_SOURCE_TYPES.MYSPACE, url, referrer_page=referrer_page)
def get_song_metadata(self) -> Metadata: @property
return Metadata({ def has_page(self) -> bool:
Mapping.FILE_WEBPAGE_URL: [self.url], return self.source_type.page is not None
Mapping.SOURCE_WEBPAGE_URL: [self.homepage]
})
def get_artist_metadata(self) -> Metadata: @property
return Metadata({ def page(self) -> Page:
Mapping.ARTIST_WEBPAGE_URL: [self.url] return self.source_type.page
})
@property
def parsed_url(self) -> ParseResult:
return urlparse(self.url)
@property @property
def hash_url(self) -> str: def hash_url(self) -> str:
return hash_url(self.url) return hash_url(self.url)
@property @property
def metadata(self) -> Metadata: def indexing_values(self) -> list:
return self.get_song_metadata() r = [hash_url(self.url)]
if self.audio_url:
@property r.append(hash_url(self.audio_url))
def indexing_values(self) -> List[Tuple[str, object]]: return r
return [
('id', self.id),
('url', self.url),
('audio_url', self.audio_url),
]
def __str__(self):
return self.__repr__()
def __repr__(self) -> str: def __repr__(self) -> str:
return f"Src({self.page_enum.value}: {self.url}, {self.audio_url})" return f"Src({self.source_type.value}: {shorten_display_url(self.url)})"
@property def __merge__(self, other: Source, **kwargs):
def title_string(self) -> str: if self.audio_url is None:
return self.url self.audio_url = other.audio_url
self.additional_data.update(other.additional_data)
page_str = property(fget=lambda self: self.page_enum.value) page_str = property(fget=lambda self: self.source_type.value)
type_str = property(fget=lambda self: self.type_enum.value)
homepage = property(fget=lambda self: SourcePages.get_homepage(self.page_enum))
class SourceCollection(Collection): class SourceTypeSorting(TypedDict):
sort_key: Callable[[SourceType], Any]
reverse: bool
only_with_page: bool
class SourceCollection:
__change_version__ = generate_id()
_indexed_sources: Dict[str, Source]
_sources_by_type: Dict[SourceType, List[Source]]
def __init__(self, data: Optional[Iterable[Source]] = None, **kwargs): def __init__(self, data: Optional[Iterable[Source]] = None, **kwargs):
self._page_to_source_list: Dict[SourcePages, List[Source]] = defaultdict(list) self._sources_by_type = defaultdict(list)
self._indexed_sources = {}
super().__init__(data=data, **kwargs) self.extend(data or [])
def _map_element(self, __object: Source, **kwargs): def source_types(
super()._map_element(__object, **kwargs) self,
only_with_page: bool = False,
sort_key = lambda page: page.name,
reverse: bool = False
) -> Iterable[SourceType]:
"""
Returns a list of all source types contained in this source collection.
self._page_to_source_list[__object.page_enum].append(__object) Args:
only_with_page (bool, optional): If True, only returns source types that have a page, meaning you can download from them.
sort_key (function, optional): A function that defines the sorting key for the source types. Defaults to lambda page: page.name.
reverse (bool, optional): If True, sorts the source types in reverse order. Defaults to False.
Returns:
Iterable[SourceType]: A list of source types.
"""
source_types: List[SourceType] = self._sources_by_type.keys()
if only_with_page:
source_types = filter(lambda st: st.has_page, source_types)
return sorted(
source_types,
key=sort_key,
reverse=reverse
)
def get_sources(self, *source_types: List[SourceType], source_type_sorting: SourceTypeSorting = None) -> Generator[Source]:
"""
Retrieves sources based on the provided source types and source type sorting.
Args:
*source_types (List[Source]): Variable number of source types to filter the sources.
source_type_sorting (SourceTypeSorting): Sorting criteria for the source types. This is only relevant if no source types are provided.
Yields:
Generator[Source]: A generator that yields the sources based on the provided filters.
Returns:
None
"""
if not len(source_types):
source_type_sorting = source_type_sorting or {}
source_types = self.source_types(**source_type_sorting)
for source_type in source_types:
yield from self._sources_by_type[source_type]
def append(self, source: Source):
if source is None:
return
existing_source = None
for key in source.indexing_values:
if key in self._indexed_sources:
existing_source = self._indexed_sources[key]
break
if existing_source is not None:
existing_source.__merge__(source)
source = existing_source
else:
self._sources_by_type[source.source_type].append(source)
changed = False
for key in source.indexing_values:
if key not in self._indexed_sources:
changed = True
self._indexed_sources[key] = source
if changed:
self.__change_version__ = generate_id()
def extend(self, sources: Iterable[Source]):
for source in sources:
self.append(source)
def __iter__(self):
yield from self.get_sources()
def __merge__(self, other: SourceCollection, **kwargs):
self.extend(other)
@property @property
def source_pages(self) -> Set[SourcePages]: def hash_url_list(self) -> List[str]:
return set(source.page_enum for source in self._data) return [hash_url(source.url) for source in self.get_sources()]
def get_sources_from_page(self, source_page: SourcePages) -> List[Source]: @property
""" def url_list(self) -> List[str]:
getting the sources for a specific page like return [source.url for source in self.get_sources()]
YouTube or musify
""" @property
return self._page_to_source_list[source_page].copy() def homepage_list(self) -> List[str]:
return [source_type.homepage for source_type in self._sources_by_type.keys()]
def indexing_values(self) -> Generator[Tuple[str, str], None, None]:
for index in self._indexed_sources:
yield "url", index

View File

@@ -1,7 +1,7 @@
from __future__ import annotations from __future__ import annotations
from pathlib import Path from pathlib import Path
from typing import List, Tuple, TextIO, Union from typing import List, Tuple, TextIO, Union, Optional
import logging import logging
import random import random
import requests import requests
@@ -31,7 +31,10 @@ class Target(OuterProxy):
} }
@classmethod @classmethod
def temp(cls, name: str = str(random.randint(0, HIGHEST_ID))) -> P: def temp(cls, name: str = str(random.randint(0, HIGHEST_ID)), file_extension: Optional[str] = None) -> P:
if file_extension is not None:
name = f"{name}.{file_extension}"
return cls(main_settings["temp_directory"] / name) return cls(main_settings["temp_directory"] / name)
# This is automatically generated # This is automatically generated

View File

@@ -1,7 +1,9 @@
from .encyclopaedia_metallum import EncyclopaediaMetallum from .encyclopaedia_metallum import EncyclopaediaMetallum
from .musify import Musify from .musify import Musify
from .musicbrainz import Musicbrainz
from .youtube import YouTube from .youtube import YouTube
from .youtube_music import YoutubeMusic from .youtube_music import YoutubeMusic
from .bandcamp import Bandcamp from .bandcamp import Bandcamp
from .genius import Genius
from .abstract import Page, INDEPENDENT_DB_OBJECTS from .abstract import Page, INDEPENDENT_DB_OBJECTS

View File

@@ -3,8 +3,9 @@ import random
import re import re
from copy import copy from copy import copy
from pathlib import Path from pathlib import Path
from typing import Optional, Union, Type, Dict, Set, List, Tuple from typing import Optional, Union, Type, Dict, Set, List, Tuple, TypedDict
from string import Formatter from string import Formatter
from dataclasses import dataclass, field
import requests import requests
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
@@ -21,131 +22,45 @@ from ..objects import (
Collection, Collection,
Label, Label,
) )
from ..utils.enums.source import SourcePages from ..utils.enums import SourceType
from ..utils.enums.album import AlbumType from ..utils.enums.album import AlbumType
from ..audio import write_metadata_to_target, correct_codec from ..audio import write_metadata_to_target, correct_codec
from ..utils.config import main_settings from ..utils.config import main_settings
from ..utils.support_classes.query import Query from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult from ..utils.support_classes.download_result import DownloadResult
from ..utils.string_processing import fit_to_file_system from ..utils.string_processing import fit_to_file_system
from ..utils import trace from ..utils import trace, output, BColors
INDEPENDENT_DB_OBJECTS = Union[Label, Album, Artist, Song] INDEPENDENT_DB_OBJECTS = Union[Label, Album, Artist, Song]
INDEPENDENT_DB_TYPES = Union[Type[Song], Type[Album], Type[Artist], Type[Label]] INDEPENDENT_DB_TYPES = Union[Type[Song], Type[Album], Type[Artist], Type[Label]]
@dataclass
class FetchOptions:
download_all: bool = False
album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
class NamingDict(dict): @dataclass
CUSTOM_KEYS: Dict[str, str] = { class DownloadOptions:
"label": "label.name", download_all: bool = False
"artist": "artist.name", album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
"song": "song.title",
"isrc": "song.isrc",
"album": "album.title",
"album_type": "album.album_type_string"
}
def __init__(self, values: dict, object_mappings: Dict[str, DatabaseObject] = None):
self.object_mappings: Dict[str, DatabaseObject] = object_mappings or dict()
super().__init__(values)
self["audio_format"] = main_settings["audio_format"]
def add_object(self, music_object: DatabaseObject):
self.object_mappings[type(music_object).__name__.lower()] = music_object
def copy(self) -> dict:
return type(self)(super().copy(), self.object_mappings.copy())
def __getitem__(self, key: str) -> str:
return fit_to_file_system(super().__getitem__(key))
def default_value_for_name(self, name: str) -> str:
return f'Various {name.replace("_", " ").title()}'
def __missing__(self, key: str) -> str:
if "." not in key:
if key not in self.CUSTOM_KEYS:
return self.default_value_for_name(key)
key = self.CUSTOM_KEYS[key]
frag_list = key.split(".")
object_name = frag_list[0].strip().lower()
attribute_name = frag_list[-1].strip().lower()
if object_name not in self.object_mappings:
return self.default_value_for_name(attribute_name)
music_object = self.object_mappings[object_name]
try:
value = getattr(music_object, attribute_name)
if value is None:
return self.default_value_for_name(attribute_name)
return str(value)
except AttributeError:
return self.default_value_for_name(attribute_name)
def _clean_music_object(music_object: INDEPENDENT_DB_OBJECTS, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
if type(music_object) == Label:
return _clean_label(label=music_object, collections=collections)
if type(music_object) == Artist:
return _clean_artist(artist=music_object, collections=collections)
if type(music_object) == Album:
return _clean_album(album=music_object, collections=collections)
if type(music_object) == Song:
return _clean_song(song=music_object, collections=collections)
def _clean_collection(collection: Collection, collection_dict: Dict[INDEPENDENT_DB_TYPES, Collection]):
if collection.element_type not in collection_dict:
return
for i, element in enumerate(collection):
r = collection_dict[collection.element_type].append(element, merge_into_existing=True)
collection[i] = r.current_element
if not r.was_the_same:
_clean_music_object(r.current_element, collection_dict)
def _clean_label(label: Label, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(label.current_artist_collection, collections)
_clean_collection(label.album_collection, collections)
def _clean_artist(artist: Artist, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(artist.main_album_collection, collections)
_clean_collection(artist.feature_song_collection, collections)
_clean_collection(artist.label_collection, collections)
def _clean_album(album: Album, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(album.label_collection, collections)
_clean_collection(album.song_collection, collections)
_clean_collection(album.artist_collection, collections)
def _clean_song(song: Song, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(song.album_collection, collections)
_clean_collection(song.feature_artist_collection, collections)
_clean_collection(song.main_artist_collection, collections)
process_audio_if_found: bool = False
process_metadata_if_found: bool = True
class Page: class Page:
""" SOURCE_TYPE: SourceType
This is an abstract class, laying out the LOGGER: logging.Logger
functionality for every other class fetching something
"""
SOURCE_TYPE: SourcePages def __new__(cls, *args, **kwargs):
LOGGER = logging.getLogger("this shouldn't be used") cls.LOGGER = logging.getLogger(cls.__name__)
# set this to true, if all song details can also be fetched by fetching album details return super().__new__(cls)
NO_ADDITIONAL_DATA_FROM_SONG = False
def __init__(self, download_options: DownloadOptions = None, fetch_options: FetchOptions = None):
self.SOURCE_TYPE.register_page(self)
self.download_options: DownloadOptions = download_options or DownloadOptions()
self.fetch_options: FetchOptions = fetch_options or FetchOptions()
def _search_regex(self, pattern, string, default=None, fatal=True, flags=0, group=None): def _search_regex(self, pattern, string, default=None, fatal=True, flags=0, group=None):
""" """
@@ -218,106 +133,7 @@ class Page:
def song_search(self, song: Song) -> List[Song]: def song_search(self, song: Song) -> List[Song]:
return [] return []
def fetch_details( # to fetch stuff
self,
music_object: DatabaseObject,
stop_at_level: int = 1,
post_process: bool = True
) -> DatabaseObject:
"""
when a music object with lacking data is passed in, it returns
the SAME object **(no copy)** with more detailed data.
If you for example put in, an album, it fetches the tracklist
:param music_object:
:param stop_at_level:
This says the depth of the level the scraper will recurse to.
If this is for example set to 2, then the levels could be:
1. Level: the album
2. Level: every song of the album + every artist of the album
If no additional requests are needed to get the data one level below the supposed stop level
this gets ignored
:return detailed_music_object: IT MODIFIES THE INPUT OBJ
"""
# creating a new object, of the same type
new_music_object: Optional[DatabaseObject] = None
fetched_from_url: List[str] = []
# only certain database objects, have a source list
if isinstance(music_object, INDEPENDENT_DB_OBJECTS):
source: Source
for source in music_object.source_collection.get_sources_from_page(self.SOURCE_TYPE):
if music_object.already_fetched_from(source.hash_url):
continue
tmp = self.fetch_object_from_source(
source=source,
enforce_type=type(music_object),
stop_at_level=stop_at_level,
post_process=False,
type_string=type(music_object).__name__,
entity_string=music_object.option_string,
)
if new_music_object is None:
new_music_object = tmp
else:
new_music_object.merge(tmp)
fetched_from_url.append(source.hash_url)
if new_music_object is not None:
music_object.merge(new_music_object)
music_object.mark_as_fetched(*fetched_from_url)
return music_object
def fetch_object_from_source(
self,
source: Source,
stop_at_level: int = 2,
enforce_type: Type[DatabaseObject] = None,
post_process: bool = True,
type_string: str = "",
entity_string: str = "",
) -> Optional[DatabaseObject]:
obj_type = self.get_source_type(source)
if obj_type is None:
return None
if enforce_type != obj_type and enforce_type is not None:
self.LOGGER.warning(f"Object type isn't type to enforce: {enforce_type}, {obj_type}")
return None
music_object: DatabaseObject = None
fetch_map = {
Song: self.fetch_song,
Album: self.fetch_album,
Artist: self.fetch_artist,
Label: self.fetch_label
}
if obj_type in fetch_map:
music_object = fetch_map[obj_type](source, stop_at_level)
else:
self.LOGGER.warning(f"Can't fetch details of type: {obj_type}")
return None
if stop_at_level > 0:
trace(f"fetching {type_string} [{entity_string}] [stop_at_level={stop_at_level}]")
collection: Collection
for collection_str in music_object.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
collection = music_object.__getattribute__(collection_str)
for sub_element in collection:
sub_element.merge(
self.fetch_details(sub_element, stop_at_level=stop_at_level - 1, post_process=False))
return music_object
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song: def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
return Song() return Song()
@@ -330,155 +146,7 @@ class Page:
def fetch_label(self, source: Source, stop_at_level: int = 1) -> Label: def fetch_label(self, source: Source, stop_at_level: int = 1) -> Label:
return Label() return Label()
def download( # to download stuff
self,
music_object: DatabaseObject,
genre: str,
download_all: bool = False,
process_metadata_anyway: bool = True
) -> DownloadResult:
naming_dict: NamingDict = NamingDict({"genre": genre})
def fill_naming_objects(naming_music_object: DatabaseObject):
nonlocal naming_dict
for collection_name in naming_music_object.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
collection: Collection = getattr(naming_music_object, collection_name)
if collection.empty:
continue
dom_ordered_music_object: DatabaseObject = collection[0]
naming_dict.add_object(dom_ordered_music_object)
return fill_naming_objects(dom_ordered_music_object)
fill_naming_objects(music_object)
return self._download(music_object, naming_dict, download_all, process_metadata_anyway=process_metadata_anyway)
def _download(
self,
music_object: DatabaseObject,
naming_dict: NamingDict,
download_all: bool = False,
skip_details: bool = False,
process_metadata_anyway: bool = True
) -> DownloadResult:
trace(f"downloading {type(music_object).__name__} [{music_object.option_string}]")
skip_next_details = skip_details
# Skips all releases, that are defined in shared.ALBUM_TYPE_BLACKLIST, if download_all is False
if isinstance(music_object, Album):
if self.NO_ADDITIONAL_DATA_FROM_SONG:
skip_next_details = True
if not download_all and music_object.album_type.value in main_settings["album_type_blacklist"]:
return DownloadResult()
if not (isinstance(music_object, Song) and self.NO_ADDITIONAL_DATA_FROM_SONG):
self.fetch_details(music_object=music_object, stop_at_level=1)
if isinstance(music_object, Album):
music_object.update_tracksort()
naming_dict.add_object(music_object)
if isinstance(music_object, Song):
return self._download_song(music_object, naming_dict, process_metadata_anyway=process_metadata_anyway)
download_result: DownloadResult = DownloadResult()
for collection_name in music_object.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
collection: Collection = getattr(music_object, collection_name)
sub_ordered_music_object: DatabaseObject
for sub_ordered_music_object in collection:
download_result.merge(self._download(sub_ordered_music_object, naming_dict.copy(), download_all,
skip_details=skip_next_details,
process_metadata_anyway=process_metadata_anyway))
return download_result
def _download_song(self, song: Song, naming_dict: NamingDict, process_metadata_anyway: bool = True):
if "genre" not in naming_dict and song.genre is not None:
naming_dict["genre"] = song.genre
if song.genre is None:
song.genre = naming_dict["genre"]
path_parts = Formatter().parse(main_settings["download_path"])
file_parts = Formatter().parse(main_settings["download_file"])
new_target = Target(
relative_to_music_dir=True,
file_path=Path(
main_settings["download_path"].format(**{part[1]: naming_dict[part[1]] for part in path_parts}),
main_settings["download_file"].format(**{part[1]: naming_dict[part[1]] for part in file_parts})
)
)
if song.target_collection.empty:
song.target_collection.append(new_target)
sources = song.source_collection.get_sources_from_page(self.SOURCE_TYPE)
if len(sources) == 0:
return DownloadResult(error_message=f"No source found for {song.title} as {self.__class__.__name__}.")
temp_target: Target = Target(
relative_to_music_dir=False,
file_path=Path(
main_settings["temp_directory"],
str(song.id)
)
)
r = DownloadResult(1)
found_on_disc = False
target: Target
for target in song.target_collection:
if target.exists:
if process_metadata_anyway:
target.copy_content(temp_target)
found_on_disc = True
r.found_on_disk += 1
r.add_target(target)
if found_on_disc and not process_metadata_anyway:
self.LOGGER.info(f"{song.option_string} already exists, thus not downloading again.")
return r
source = sources[0]
if not found_on_disc:
r = self.download_song_to_target(source=source, target=temp_target, desc=song.option_string)
if not r.is_fatal_error:
r.merge(self._post_process_targets(song, temp_target,
[] if found_on_disc else self.get_skip_intervals(song, source)))
return r
def _post_process_targets(self, song: Song, temp_target: Target, interval_list: List) -> DownloadResult:
correct_codec(temp_target, interval_list=interval_list)
self.post_process_hook(song, temp_target)
write_metadata_to_target(song.metadata, temp_target, song)
r = DownloadResult()
target: Target
for target in song.target_collection:
if temp_target is not target:
temp_target.copy_content(target)
r.add_target(target)
temp_target.delete()
r.sponsor_segments += len(interval_list)
return r
def get_skip_intervals(self, song: Song, source: Source) -> List[Tuple[float, float]]: def get_skip_intervals(self, song: Song, source: Source) -> List[Tuple[float, float]]:
return [] return []

View File

@@ -10,7 +10,7 @@ from .abstract import Page
from ..objects import ( from ..objects import (
Artist, Artist,
Source, Source,
SourcePages, SourceType,
Song, Song,
Album, Album,
Label, Label,
@@ -22,6 +22,8 @@ from ..objects import (
Artwork, Artwork,
) )
from ..connection import Connection from ..connection import Connection
from ..utils import dump_to_file
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.support_classes.download_result import DownloadResult from ..utils.support_classes.download_result import DownloadResult
from ..utils.string_processing import clean_song_title from ..utils.string_processing import clean_song_title
from ..utils.config import main_settings, logging_settings from ..utils.config import main_settings, logging_settings
@@ -48,9 +50,7 @@ class BandcampTypes(Enum):
class Bandcamp(Page): class Bandcamp(Page):
# CHANGE SOURCE_TYPE = ALL_SOURCE_TYPES.BANDCAMP
SOURCE_TYPE = SourcePages.BANDCAMP
LOGGER = logging_settings["bandcamp_logger"]
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
self.connection: Connection = Connection( self.connection: Connection = Connection(
@@ -62,8 +62,7 @@ class Bandcamp(Page):
super().__init__(*args, **kwargs) super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]: def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
parsed_url = urlparse(source.url) path = source.parsed_url.path.replace("/", "")
path = parsed_url.path.replace("/", "")
if path == "" or path.startswith("music"): if path == "" or path.startswith("music"):
return Artist return Artist
@@ -118,7 +117,7 @@ class Bandcamp(Page):
return Song( return Song(
title=clean_song_title(name, artist_name=data["band_name"]), title=clean_song_title(name, artist_name=data["band_name"]),
source_list=source_list, source_list=source_list,
main_artist_list=[ artist_list=[
Artist( Artist(
name=data["band_name"], name=data["band_name"],
source_list=[ source_list=[
@@ -185,7 +184,7 @@ class Bandcamp(Page):
if li is None and li['href'] is not None: if li is None and li['href'] is not None:
continue continue
source_list.append(Source.match_url(_parse_artist_url(li['href']), referer_page=self.SOURCE_TYPE)) source_list.append(Source.match_url(_parse_artist_url(li['href']), referrer_page=self.SOURCE_TYPE))
return Artist( return Artist(
name=name, name=name,
@@ -238,7 +237,7 @@ class Bandcamp(Page):
html_music_grid = soup.find("ol", {"id": "music-grid"}) html_music_grid = soup.find("ol", {"id": "music-grid"})
if html_music_grid is not None: if html_music_grid is not None:
for subsoup in html_music_grid.find_all("li"): for subsoup in html_music_grid.find_all("li"):
artist.main_album_collection.append(self._parse_album(soup=subsoup, initial_source=source)) artist.album_collection.append(self._parse_album(soup=subsoup, initial_source=source))
for i, data_blob_soup in enumerate(soup.find_all("div", {"id": ["pagedata", "collectors-data"]})): for i, data_blob_soup in enumerate(soup.find_all("div", {"id": ["pagedata", "collectors-data"]})):
data_blob = data_blob_soup["data-blob"] data_blob = data_blob_soup["data-blob"]
@@ -247,7 +246,7 @@ class Bandcamp(Page):
dump_to_file(f"bandcamp_artist_data_blob_{i}.json", data_blob, is_json=True, exit_after_dump=False) dump_to_file(f"bandcamp_artist_data_blob_{i}.json", data_blob, is_json=True, exit_after_dump=False)
if data_blob is not None: if data_blob is not None:
artist.main_album_collection.extend( artist.album_collection.extend(
self._parse_artist_data_blob(json.loads(data_blob), source.url) self._parse_artist_data_blob(json.loads(data_blob), source.url)
) )
@@ -371,7 +370,7 @@ class Bandcamp(Page):
date=ID3Timestamp.strptime(data["datePublished"], "%d %b %Y %H:%M:%S %Z"), date=ID3Timestamp.strptime(data["datePublished"], "%d %b %Y %H:%M:%S %Z"),
source_list=[Source(self.SOURCE_TYPE, album_data["@id"])] source_list=[Source(self.SOURCE_TYPE, album_data["@id"])]
)], )],
main_artist_list=[Artist( artist_list=[Artist(
name=artist_data["name"].strip(), name=artist_data["name"].strip(),
source_list=[Source(self.SOURCE_TYPE, _parse_artist_url(artist_data["@id"]))] source_list=[Source(self.SOURCE_TYPE, _parse_artist_url(artist_data["@id"]))]
)], )],

View File

@@ -7,7 +7,7 @@ from urllib.parse import urlparse, urlencode
from ..connection import Connection from ..connection import Connection
from ..utils.config import logging_settings from ..utils.config import logging_settings
from .abstract import Page from .abstract import Page
from ..utils.enums.source import SourcePages from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.enums.album import AlbumType from ..utils.enums.album import AlbumType
from ..utils.support_classes.query import Query from ..utils.support_classes.query import Query
from ..objects import ( from ..objects import (
@@ -52,14 +52,14 @@ def _song_from_json(artist_html=None, album_html=None, release_type=None, title=
return Song( return Song(
title=title, title=title,
main_artist_list=[ artist_list=[
_artist_from_json(artist_html=artist_html) _artist_from_json(artist_html=artist_html)
], ],
album_list=[ album_list=[
_album_from_json(album_html=album_html, release_type=release_type, artist_html=artist_html) _album_from_json(album_html=album_html, release_type=release_type, artist_html=artist_html)
], ],
source_list=[ source_list=[
Source(SourcePages.ENCYCLOPAEDIA_METALLUM, song_id) Source(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, song_id)
] ]
) )
@@ -85,7 +85,7 @@ def _artist_from_json(artist_html=None, genre=None, country=None) -> Artist:
return Artist( return Artist(
name=artist_name, name=artist_name,
source_list=[ source_list=[
Source(SourcePages.ENCYCLOPAEDIA_METALLUM, artist_url) Source(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, artist_url)
] ]
) )
@@ -105,7 +105,7 @@ def _album_from_json(album_html=None, release_type=None, artist_html=None) -> Al
title=album_name, title=album_name,
album_type=album_type, album_type=album_type,
source_list=[ source_list=[
Source(SourcePages.ENCYCLOPAEDIA_METALLUM, album_url) Source(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, album_url)
], ],
artist_list=[ artist_list=[
_artist_from_json(artist_html=artist_html) _artist_from_json(artist_html=artist_html)
@@ -207,7 +207,7 @@ def create_grid(
class EncyclopaediaMetallum(Page): class EncyclopaediaMetallum(Page):
SOURCE_TYPE = SourcePages.ENCYCLOPAEDIA_METALLUM SOURCE_TYPE = ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM
LOGGER = logging_settings["metal_archives_logger"] LOGGER = logging_settings["metal_archives_logger"]
def __init__(self, **kwargs): def __init__(self, **kwargs):
@@ -266,7 +266,7 @@ class EncyclopaediaMetallum(Page):
song_title = song.title.strip() song_title = song.title.strip()
album_titles = ["*"] if song.album_collection.empty else [album.title.strip() for album in song.album_collection] album_titles = ["*"] if song.album_collection.empty else [album.title.strip() for album in song.album_collection]
artist_titles = ["*"] if song.main_artist_collection.empty else [artist.name.strip() for artist in song.main_artist_collection] artist_titles = ["*"] if song.artist_collection.empty else [artist.name.strip() for artist in song.artist_collection]
search_results = [] search_results = []
@@ -486,7 +486,7 @@ class EncyclopaediaMetallum(Page):
href = anchor["href"] href = anchor["href"]
if href is not None: if href is not None:
source_list.append(Source.match_url(href, referer_page=self.SOURCE_TYPE)) source_list.append(Source.match_url(href, referrer_page=self.SOURCE_TYPE))
# The following code is only legacy code, which I just kep because it doesn't harm. # The following code is only legacy code, which I just kep because it doesn't harm.
# The way ma returns sources changed. # The way ma returns sources changed.
@@ -504,7 +504,7 @@ class EncyclopaediaMetallum(Page):
if url is None: if url is None:
continue continue
source_list.append(Source.match_url(url, referer_page=self.SOURCE_TYPE)) source_list.append(Source.match_url(url, referrer_page=self.SOURCE_TYPE))
return source_list return source_list
@@ -663,7 +663,7 @@ class EncyclopaediaMetallum(Page):
artist.notes = band_notes artist.notes = band_notes
discography: List[Album] = self._fetch_artist_discography(artist_id) discography: List[Album] = self._fetch_artist_discography(artist_id)
artist.main_album_collection.extend(discography) artist.album_collection.extend(discography)
return artist return artist
@@ -832,7 +832,7 @@ class EncyclopaediaMetallum(Page):
) )
def get_source_type(self, source: Source): def get_source_type(self, source: Source):
if self.SOURCE_TYPE != source.page_enum: if self.SOURCE_TYPE != source.source_type:
return None return None
url = source.url url = source.url

View File

@@ -0,0 +1,297 @@
from typing import List, Optional, Type
from urllib.parse import urlparse, urlunparse, urlencode
import json
from enum import Enum
from bs4 import BeautifulSoup
import pycountry
from ..objects import Source, DatabaseObject
from .abstract import Page
from ..objects import (
Artist,
Source,
SourceType,
Song,
Album,
Label,
Target,
Contact,
ID3Timestamp,
Lyrics,
FormattedText,
Artwork,
)
from ..connection import Connection
from ..utils import dump_to_file, traverse_json_path
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.support_classes.download_result import DownloadResult
from ..utils.string_processing import clean_song_title
from ..utils.config import main_settings, logging_settings
from ..utils.shared import DEBUG
if DEBUG:
from ..utils import dump_to_file
class Genius(Page):
SOURCE_TYPE = ALL_SOURCE_TYPES.GENIUS
HOST = "genius.com"
def __init__(self, *args, **kwargs):
self.connection: Connection = Connection(
host="https://genius.com/",
logger=self.LOGGER,
module="genius",
)
super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
path = source.parsed_url.path.replace("/", "")
if path.startswith("artists"):
return Artist
if path.startswith("albums"):
return Album
return Song
def add_to_artwork(self, artwork: Artwork, url: str):
if url is None:
return
url_frags = url.split(".")
if len(url_frags) < 2:
artwork.append(url=url)
return
dimensions = url_frags[-2].split("x")
if len(dimensions) < 2:
artwork.append(url=url)
return
if len(dimensions) == 3:
dimensions = dimensions[:-1]
try:
artwork.append(url=url, width=int(dimensions[0]), height=int(dimensions[1]))
except ValueError:
artwork.append(url=url)
def parse_api_object(self, data: dict) -> Optional[DatabaseObject]:
if data is None:
return None
object_type = data.get("_type")
artwork = Artwork()
self.add_to_artwork(artwork, data.get("header_image_url"))
self.add_to_artwork(artwork, data.get("image_url"))
additional_sources: List[Source] = []
source: Source = Source(self.SOURCE_TYPE, data.get("url"), additional_data={
"id": data.get("id"),
"slug": data.get("slug"),
"api_path": data.get("api_path"),
})
notes = FormattedText()
description = data.get("description") or {}
if "html" in description:
notes.html = description["html"]
elif "markdown" in description:
notes.markdown = description["markdown"]
elif "description_preview" in data:
notes.plaintext = data["description_preview"]
if source.url is None:
return None
if object_type == "artist":
if data.get("instagram_name") is not None:
additional_sources.append(Source(ALL_SOURCE_TYPES.INSTAGRAM, f"https://www.instagram.com/{data['instagram_name']}/"))
if data.get("facebook_name") is not None:
additional_sources.append(Source(ALL_SOURCE_TYPES.FACEBOOK, f"https://www.facebook.com/{data['facebook_name']}/"))
if data.get("twitter_name") is not None:
additional_sources.append(Source(ALL_SOURCE_TYPES.TWITTER, f"https://x.com/{data['twitter_name']}/"))
return Artist(
name=data["name"].strip() if data.get("name") is not None else None,
source_list=[source],
artwork=artwork,
notes=notes,
)
if object_type == "album":
self.add_to_artwork(artwork, data.get("cover_art_thumbnail_url"))
self.add_to_artwork(artwork, data.get("cover_art_url"))
for cover_art in data.get("cover_arts", []):
self.add_to_artwork(artwork, cover_art.get("image_url"))
self.add_to_artwork(artwork, cover_art.get("thumbnail_image_url"))
return Album(
title=data.get("name").strip(),
source_list=[source],
artist_list=[self.parse_api_object(data.get("artist"))],
artwork=artwork,
date=ID3Timestamp(**(data.get("release_date_components") or {})),
)
if object_type == "song":
self.add_to_artwork(artwork, data.get("song_art_image_thumbnail_url"))
self.add_to_artwork(artwork, data.get("song_art_image_url"))
main_artist_list = []
featured_artist_list = []
_artist_name = None
primary_artist = self.parse_api_object(data.get("primary_artist"))
if primary_artist is not None:
_artist_name = primary_artist.name
main_artist_list.append(primary_artist)
for feature_artist in (*(data.get("featured_artists") or []), *(data.get("producer_artists") or []), *(data.get("writer_artists") or [])):
artist = self.parse_api_object(feature_artist)
if artist is not None:
featured_artist_list.append(artist)
return Song(
title=clean_song_title(data.get("title"), artist_name=_artist_name),
source_list=[source],
artwork=artwork,
feature_artist_list=featured_artist_list,
artist_list=main_artist_list,
)
return None
def general_search(self, search_query: str, **kwargs) -> List[DatabaseObject]:
results = []
search_params = {
"q": search_query,
}
r = self.connection.get("https://genius.com/api/search/multi?" + urlencode(search_params), name=f"search_{search_query}")
if r is None:
return results
dump_to_file("search_genius.json", r.text, is_json=True, exit_after_dump=False)
data = r.json()
for elements in traverse_json_path(data, "response.sections", default=[]):
hits = elements.get("hits", [])
for hit in hits:
parsed = self.parse_api_object(hit.get("result"))
if parsed is not None:
results.append(parsed)
return results
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
artist: Artist = Artist()
# https://genius.com/api/artists/24527/albums?page=1
r = self.connection.get(source.url, name=source.url)
if r is None:
return artist
soup = self.get_soup_from_response(r)
# find the content attribute in the meta tag which is contained in the head
data_container = soup.find("meta", {"itemprop": "page_data"})
if data_container is not None:
content = data_container["content"]
dump_to_file("genius_itemprop_artist.json", content, is_json=True, exit_after_dump=False)
data = json.loads(content)
artist = self.parse_api_object(data.get("artist"))
for e in (data.get("artist_albums") or []):
r = self.parse_api_object(e)
if not isinstance(r, Album):
continue
artist.album_collection.append(r)
for e in (data.get("artist_songs") or []):
r = self.parse_api_object(e)
if not isinstance(r, Song):
continue
"""
TODO
fetch the album for these songs, because the api doesn't
return them
"""
artist.album_collection.extend(r.album_collection)
artist.source_collection.append(source)
return artist
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
album: Album = Album()
# https://genius.com/api/artists/24527/albums?page=1
r = self.connection.get(source.url, name=source.url)
if r is None:
return album
soup = self.get_soup_from_response(r)
# find the content attribute in the meta tag which is contained in the head
data_container = soup.find("meta", {"itemprop": "page_data"})
if data_container is not None:
content = data_container["content"]
dump_to_file("genius_itemprop_album.json", content, is_json=True, exit_after_dump=False)
data = json.loads(content)
album = self.parse_api_object(data.get("album"))
for e in data.get("album_appearances", []):
r = self.parse_api_object(e.get("song"))
if not isinstance(r, Song):
continue
album.song_collection.append(r)
album.source_collection.append(source)
return album
def get_json_content_from_response(self, response, start: str, end: str) -> Optional[str]:
content = response.text
start_index = content.find(start)
if start_index < 0:
return None
start_index += len(start)
end_index = content.find(end, start_index)
if end_index < 0:
return None
return content[start_index:end_index]
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
song: Song = Song()
r = self.connection.get(source.url, name=source.url)
if r is None:
return song
# get the contents that are between `JSON.parse('` and `');`
content = self.get_json_content_from_response(r, start="window.__PRELOADED_STATE__ = JSON.parse('", end="');\n window.__APP_CONFIG__ = ")
if content is not None:
content = content.replace("\\\\", "\\").replace('\\"', '"').replace("\\'", "'")
data = json.loads(content)
lyrics_html = traverse_json_path(data, "songPage.lyricsData.body.html", default=None)
if lyrics_html is not None:
song.lyrics_collection.append(Lyrics(FormattedText(html=lyrics_html)))
dump_to_file("genius_song_script_json.json", content, is_json=True, exit_after_dump=False)
soup = self.get_soup_from_response(r)
for lyrics in soup.find_all("div", {"data-lyrics-container": "true"}):
lyrics_object = Lyrics(FormattedText(html=lyrics.prettify()))
song.lyrics_collection.append(lyrics_object)
song.source_collection.append(source)
return song

View File

@@ -0,0 +1,145 @@
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Type, Union, Generator, Dict, Any
from urllib.parse import urlparse
import pycountry
import musicbrainzngs
from bs4 import BeautifulSoup
from ..connection import Connection
from .abstract import Page
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.enums.album import AlbumType, AlbumStatus
from ..objects import (
Artist,
Source,
Song,
Album,
ID3Timestamp,
FormattedText,
Label,
Target,
DatabaseObject,
Lyrics,
Artwork
)
from ..utils.config import logging_settings, main_settings
from ..utils import string_processing, shared
from ..utils.string_processing import clean_song_title
from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
class Musicbrainz(Page):
SOURCE_TYPE = ALL_SOURCE_TYPES.MUSICBRAINZ
HOST = "https://musicbrainz.org"
def __init__(self, *args, **kwargs):
musicbrainzngs.set_useragent("mk", "1")
super().__init__(*args, **kwargs)
def general_search(self, search_query: str) -> List[DatabaseObject]:
search_results = []
#Artist
search_results += self.artist_search(search_query).copy()
#Album
search_results += self.album_search(search_query).copy()
#Song
search_results += self.song_search(search_query).copy()
return search_results
def artist_search(self, search_query: str) -> List[Artist]:
artist_list = []
#Artist
artist_dict_list: list = musicbrainzngs.search_artists(search_query)['artist-list']
artist_source_list: List[Source] = []
for artist_dict in artist_dict_list:
artist_source_list.append(Source(self.SOURCE_TYPE, self.HOST + "/artist/" + artist_dict['id']))
artist_list.append(Artist(
name=artist_dict['name'],
source_list=artist_source_list
))
return artist_list
def song_search(self, search_query: str) -> List[Song]:
song_list = []
#Song
song_dict_list: list = musicbrainzngs.search_recordings(search_query)['recording-list']
song_source_list: List[Source] = []
for song_dict in song_dict_list:
song_source_list.append(Source(self.SOURCE_TYPE, self.HOST + "/recording/" + song_dict['id']))
song_list.append(Song(
title=song_dict['title'],
source_list=song_source_list
))
return song_list
def album_search(self, search_query: str) -> List[Album]:
album_list = []
#Album
album_dict_list: list = musicbrainzngs.search_release_groups(search_query)['release-group-list']
album_source_list: List[Source] = []
for album_dict in album_dict_list:
album_source_list.append(Source(self.SOURCE_TYPE, self.HOST + "/release-group/" + album_dict['id']))
album_list.append(Album(
title=album_dict['title'],
source_list=album_source_list
))
return album_list
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
album_list = []
#Album
album_dict_list: list = musicbrainzngs.search_release_groups(search_query)['release-group-list']
album_source_list: List[Source] = []
for album_dict in album_dict_list:
album_source_list.append(Source(self.SOURCE_TYPE, self.HOST + "/release-group/" + album_dict['id']))
album_list.append(Album(
title=album_dict['title'],
source_list=album_source_list
))
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
artist_list = []
#Artist
artist_dict_list: list = musicbrainzngs.search_artists(search_query)['artist-list']
artist_source_list: List[Source] = []
for artist_dict in artist_dict_list:
artist_source_list.append(Source(self.SOURCE_TYPE, self.HOST + "/artist/" + artist_dict['id']))
artist_list.append(Artist(
name=artist_dict['name'],
source_list=artist_source_list,
))
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
song_list = []
#Song
song_dict_list: list = musicbrainzngs.search_recordings(search_query)['recording-list']
song_source_list: List[Source] = []
for song_dict in song_dict_list:
song_source_list.append(Source(self.SOURCE_TYPE, self.HOST + "/recording/" + song_dict['id']))
song_list.append(Song(
title=song_dict['title'],
source_list=song_source_list
))

View File

@@ -1,7 +1,7 @@
from collections import defaultdict from collections import defaultdict
from dataclasses import dataclass from dataclasses import dataclass
from enum import Enum from enum import Enum
from typing import List, Optional, Type, Union, Generator from typing import List, Optional, Type, Union, Generator, Dict, Any
from urllib.parse import urlparse from urllib.parse import urlparse
import pycountry import pycountry
@@ -9,7 +9,7 @@ from bs4 import BeautifulSoup
from ..connection import Connection from ..connection import Connection
from .abstract import Page from .abstract import Page
from ..utils.enums.source import SourcePages from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.enums.album import AlbumType, AlbumStatus from ..utils.enums.album import AlbumType, AlbumStatus
from ..objects import ( from ..objects import (
Artist, Artist,
@@ -24,7 +24,7 @@ from ..objects import (
Lyrics, Lyrics,
Artwork Artwork
) )
from ..utils.config import logging_settings from ..utils.config import logging_settings, main_settings
from ..utils import string_processing, shared from ..utils import string_processing, shared
from ..utils.string_processing import clean_song_title from ..utils.string_processing import clean_song_title
from ..utils.support_classes.query import Query from ..utils.support_classes.query import Query
@@ -111,9 +111,7 @@ def parse_url(url: str) -> MusifyUrl:
class Musify(Page): class Musify(Page):
# CHANGE SOURCE_TYPE = ALL_SOURCE_TYPES.MUSIFY
SOURCE_TYPE = SourcePages.MUSIFY
LOGGER = logging_settings["musify_logger"]
HOST = "https://musify.club" HOST = "https://musify.club"
@@ -361,7 +359,7 @@ class Musify(Page):
return Song( return Song(
title=clean_song_title(song_title, artist_name=artist_list[0].name if len(artist_list) > 0 else None), title=clean_song_title(song_title, artist_name=artist_list[0].name if len(artist_list) > 0 else None),
main_artist_list=artist_list, feature_artist_list=artist_list,
source_list=source_list source_list=source_list
) )
@@ -418,6 +416,10 @@ class Musify(Page):
href = artist_soup["href"] href = artist_soup["href"]
if href is not None: if href is not None:
href_parts = href.split("/")
if len(href_parts) <= 1 or href_parts[-2] != "artist":
return
artist_src_list.append(Source(self.SOURCE_TYPE, self.HOST + href)) artist_src_list.append(Source(self.SOURCE_TYPE, self.HOST + href))
name_elem: BeautifulSoup = artist_soup.find("span", {"itemprop": "name"}) name_elem: BeautifulSoup = artist_soup.find("span", {"itemprop": "name"})
@@ -500,17 +502,26 @@ class Musify(Page):
for video_container in video_container_list: for video_container in video_container_list:
iframe_list: List[BeautifulSoup] = video_container.findAll("iframe") iframe_list: List[BeautifulSoup] = video_container.findAll("iframe")
for iframe in iframe_list: for iframe in iframe_list:
"""
the url could look like this
https://www.youtube.com/embed/sNObCkhzOYA?si=dNVgnZMBNVlNb0P_
"""
parsed_url = urlparse(iframe["src"])
path_parts = parsed_url.path.strip("/").split("/")
if path_parts[0] != "embed" or len(path_parts) < 2:
continue
source_list.append(Source( source_list.append(Source(
SourcePages.YOUTUBE, ALL_SOURCE_TYPES.YOUTUBE,
iframe["src"], f"https://music.youtube.com/watch?v={path_parts[1]}",
referer_page=self.SOURCE_TYPE referrer_page=self.SOURCE_TYPE
)) ))
return Song( return Song(
title=clean_song_title(track_name, artist_name=artist_list[0].name if len(artist_list) > 0 else None), title=clean_song_title(track_name, artist_name=artist_list[0].name if len(artist_list) > 0 else None),
source_list=source_list, source_list=source_list,
lyrics_list=lyrics_list, lyrics_list=lyrics_list,
main_artist_list=artist_list, feature_artist_list=artist_list,
album_list=album_list, album_list=album_list,
artwork=artwork, artwork=artwork,
) )
@@ -652,10 +663,104 @@ class Musify(Page):
return Song( return Song(
title=clean_song_title(song_name, artist_name=artist_list[0].name if len(artist_list) > 0 else None), title=clean_song_title(song_name, artist_name=artist_list[0].name if len(artist_list) > 0 else None),
tracksort=tracksort, tracksort=tracksort,
main_artist_list=artist_list, feature_artist_list=artist_list,
source_list=source_list source_list=source_list
) )
def _parse_album(self, soup: BeautifulSoup) -> Album:
name: str = None
source_list: List[Source] = []
artist_list: List[Artist] = []
date: ID3Timestamp = None
"""
if breadcrumb list has 4 elements, then
the -2 is the artist link,
the -1 is the album
"""
# breadcrumb
breadcrumb_soup: BeautifulSoup = soup.find("ol", {"class", "breadcrumb"})
breadcrumb_elements: List[BeautifulSoup] = breadcrumb_soup.find_all("li", {"class": "breadcrumb-item"})
if len(breadcrumb_elements) == 4:
# album
album_crumb: BeautifulSoup = breadcrumb_elements[-1]
name = album_crumb.text.strip()
# artist
artist_crumb: BeautifulSoup = breadcrumb_elements[-2]
anchor: BeautifulSoup = artist_crumb.find("a")
if anchor is not None:
href = anchor.get("href")
href_parts = href.split("/")
if not(len(href_parts) <= 1 or href_parts[-2] != "artist"):
artist_source_list: List[Source] = []
if href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, self.HOST + href.strip()))
span: BeautifulSoup = anchor.find("span")
if span is not None:
artist_list.append(Artist(
name=span.get_text(strip=True),
source_list=artist_source_list
))
else:
self.LOGGER.debug("there are not 4 breadcrumb items, which shouldn't be the case")
# meta
meta_url: BeautifulSoup = soup.find("meta", {"itemprop": "url"})
if meta_url is not None:
url = meta_url.get("content")
if url is not None:
source_list.append(Source(self.SOURCE_TYPE, self.HOST + url))
meta_name: BeautifulSoup = soup.find("meta", {"itemprop": "name"})
if meta_name is not None:
_name = meta_name.get("content")
if _name is not None:
name = _name
# album info
album_info_ul: BeautifulSoup = soup.find("ul", {"class": "album-info"})
if album_info_ul is not None:
artist_anchor: BeautifulSoup
for artist_anchor in album_info_ul.find_all("a", {"itemprop": "byArtist"}):
# line 98
artist_source_list: List[Source] = []
artist_url_meta = artist_anchor.find("meta", {"itemprop": "url"})
if artist_url_meta is not None:
artist_href = artist_url_meta.get("content")
if artist_href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, url=self.HOST + artist_href))
artist_meta_name = artist_anchor.find("meta", {"itemprop": "name"})
if artist_meta_name is not None:
artist_name = artist_meta_name.get("content")
if artist_name is not None:
artist_list.append(Artist(
name=artist_name,
source_list=artist_source_list
))
time_soup: BeautifulSoup = album_info_ul.find("time", {"itemprop": "datePublished"})
if time_soup is not None:
raw_datetime = time_soup.get("datetime")
if raw_datetime is not None:
try:
date = ID3Timestamp.strptime(raw_datetime, "%Y-%m-%d")
except ValueError:
self.LOGGER.debug(f"Raw datetime doesn't match time format %Y-%m-%d: {raw_datetime}")
return Album(
title=name,
source_list=source_list,
artist_list=artist_list,
date=date
)
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album: def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
""" """
fetches album from source: fetches album from source:
@@ -690,30 +795,18 @@ class Musify(Page):
new_song = self._parse_song_card(card_soup) new_song = self._parse_song_card(card_soup)
album.song_collection.append(new_song) album.song_collection.append(new_song)
if stop_at_level > 1:
song: Song
for song in album.song_collection:
sources = song.source_collection.get_sources_from_page(self.SOURCE_TYPE)
for source in sources:
song.merge(self.fetch_song(source=source))
album.update_tracksort() album.update_tracksort()
return album return album
def _get_artist_attributes(self, url: MusifyUrl) -> Artist: def _fetch_initial_artist(self, url: MusifyUrl, source: Source, **kwargs) -> Artist:
""" """
fetches the main Artist attributes from this endpoint
https://musify.club/artist/ghost-bath-280348?_pjax=#bodyContent https://musify.club/artist/ghost-bath-280348?_pjax=#bodyContent
it needs to parse html
:param url:
:return:
""" """
r = self.connection.get(f"https://musify.club/{url.source_type.value}/{url.name_with_id}?_pjax=#bodyContent", name="artist_attributes_" + url.name_with_id) r = self.connection.get(f"https://musify.club/{url.source_type.value}/{url.name_with_id}?_pjax=#bodyContent", name="artist_attributes_" + url.name_with_id)
if r is None: if r is None:
return Artist() return Artist(source_list=[source])
soup = self.get_soup_from_response(r) soup = self.get_soup_from_response(r)
@@ -812,7 +905,7 @@ class Musify(Page):
href = additional_source.get("href") href = additional_source.get("href")
if href is None: if href is None:
continue continue
new_src = Source.match_url(href, referer_page=self.SOURCE_TYPE) new_src = Source.match_url(href, referrer_page=self.SOURCE_TYPE)
if new_src is None: if new_src is None:
continue continue
source_list.append(new_src) source_list.append(new_src)
@@ -828,7 +921,7 @@ class Musify(Page):
notes=notes notes=notes
) )
def _parse_album_card(self, album_card: BeautifulSoup, artist_name: str = None) -> Album: def _parse_album_card(self, album_card: BeautifulSoup, artist_name: str = None, **kwargs) -> Album:
""" """
<div class="card release-thumbnail" data-type="2"> <div class="card release-thumbnail" data-type="2">
<a href="/release/ghost-bath-self-loather-2021-1554266"> <a href="/release/ghost-bath-self-loather-2021-1554266">
@@ -852,46 +945,20 @@ class Musify(Page):
</div> </div>
""" """
_id: Optional[str] = None album_kwargs: Dict[str, Any] = {
name: str = None "source_list": [],
source_list: List[Source] = [] }
timestamp: Optional[ID3Timestamp] = None
album_status = None
def set_name(new_name: str):
nonlocal name
nonlocal artist_name
# example of just setting not working:
# https://musify.club/release/unjoy-eurythmie-psychonaut-4-tired-numb-still-alive-2012-324067
if new_name.count(" - ") != 1:
name = new_name
return
potential_artist_list, potential_name = new_name.split(" - ")
unified_artist_list = string_processing.unify(potential_artist_list)
if artist_name is not None:
if string_processing.unify(artist_name) not in unified_artist_list:
name = new_name
return
name = potential_name
return
name = new_name
album_status_id = album_card.get("data-type") album_status_id = album_card.get("data-type")
if album_status_id.isdigit(): if album_status_id.isdigit():
album_status_id = int(album_status_id) album_status_id = int(album_status_id)
album_type = ALBUM_TYPE_MAP[album_status_id] album_kwargs["album_type"] = ALBUM_TYPE_MAP[album_status_id]
if album_status_id == 5: if album_status_id == 5:
album_status = AlbumStatus.BOOTLEG album_kwargs["album_status"] = AlbumStatus.BOOTLEG
def parse_release_anchor(_anchor: BeautifulSoup, text_is_name=False): def parse_release_anchor(_anchor: BeautifulSoup, text_is_name=False):
nonlocal _id nonlocal album_kwargs
nonlocal name
nonlocal source_list
if _anchor is None: if _anchor is None:
return return
@@ -899,20 +966,13 @@ class Musify(Page):
href = _anchor.get("href") href = _anchor.get("href")
if href is not None: if href is not None:
# add url to sources # add url to sources
source_list.append(Source( album_kwargs["source_list"].append(Source(
self.SOURCE_TYPE, self.SOURCE_TYPE,
self.HOST + href self.HOST + href
)) ))
# split id from url if text_is_name:
split_href = href.split("-") album_kwargs["title"] = clean_song_title(_anchor.text, artist_name)
if len(split_href) > 1:
_id = split_href[-1]
if not text_is_name:
return
set_name(_anchor.text)
anchor_list = album_card.find_all("a", recursive=False) anchor_list = album_card.find_all("a", recursive=False)
if len(anchor_list) > 0: if len(anchor_list) > 0:
@@ -923,7 +983,7 @@ class Musify(Page):
if thumbnail is not None: if thumbnail is not None:
alt = thumbnail.get("alt") alt = thumbnail.get("alt")
if alt is not None: if alt is not None:
set_name(alt) album_kwargs["title"] = clean_song_title(alt, artist_name)
image_url = thumbnail.get("src") image_url = thumbnail.get("src")
else: else:
@@ -940,7 +1000,7 @@ class Musify(Page):
13.11.2021 13.11.2021
</small> </small>
""" """
nonlocal timestamp nonlocal album_kwargs
italic_tagging_soup: BeautifulSoup = small_soup.find("i") italic_tagging_soup: BeautifulSoup = small_soup.find("i")
if italic_tagging_soup is None: if italic_tagging_soup is None:
@@ -950,7 +1010,7 @@ class Musify(Page):
return return
raw_time = small_soup.text.strip() raw_time = small_soup.text.strip()
timestamp = ID3Timestamp.strptime(raw_time, "%d.%m.%Y") album_kwargs["date"] = ID3Timestamp.strptime(raw_time, "%d.%m.%Y")
# parse small date # parse small date
card_footer_list = album_card.find_all("div", {"class": "card-footer"}) card_footer_list = album_card.find_all("div", {"class": "card-footer"})
@@ -963,112 +1023,18 @@ class Musify(Page):
else: else:
self.LOGGER.debug("there is not even 1 footer in the album card") self.LOGGER.debug("there is not even 1 footer in the album card")
return Album( return Album(**album_kwargs)
title=name,
source_list=source_list,
date=timestamp,
album_type=album_type,
album_status=album_status
)
def _parse_album(self, soup: BeautifulSoup) -> Album: def _fetch_artist_discography(self, artist: Artist, url: MusifyUrl, artist_name: str = None, **kwargs):
name: str = None
source_list: List[Source] = []
artist_list: List[Artist] = []
date: ID3Timestamp = None
"""
if breadcrumb list has 4 elements, then
the -2 is the artist link,
the -1 is the album
"""
# breadcrumb
breadcrumb_soup: BeautifulSoup = soup.find("ol", {"class", "breadcrumb"})
breadcrumb_elements: List[BeautifulSoup] = breadcrumb_soup.find_all("li", {"class": "breadcrumb-item"})
if len(breadcrumb_elements) == 4:
# album
album_crumb: BeautifulSoup = breadcrumb_elements[-1]
name = album_crumb.text.strip()
# artist
artist_crumb: BeautifulSoup = breadcrumb_elements[-2]
anchor: BeautifulSoup = artist_crumb.find("a")
if anchor is not None:
href = anchor.get("href")
artist_source_list: List[Source] = []
if href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, self.HOST + href.strip()))
span: BeautifulSoup = anchor.find("span")
if span is not None:
artist_list.append(Artist(
name=span.get_text(strip=True),
source_list=artist_source_list
))
else:
self.LOGGER.debug("there are not 4 breadcrumb items, which shouldn't be the case")
# meta
meta_url: BeautifulSoup = soup.find("meta", {"itemprop": "url"})
if meta_url is not None:
url = meta_url.get("content")
if url is not None:
source_list.append(Source(self.SOURCE_TYPE, self.HOST + url))
meta_name: BeautifulSoup = soup.find("meta", {"itemprop": "name"})
if meta_name is not None:
_name = meta_name.get("content")
if _name is not None:
name = _name
# album info
album_info_ul: BeautifulSoup = soup.find("ul", {"class": "album-info"})
if album_info_ul is not None:
artist_anchor: BeautifulSoup
for artist_anchor in album_info_ul.find_all("a", {"itemprop": "byArtist"}):
# line 98
artist_source_list: List[Source] = []
artist_url_meta = artist_anchor.find("meta", {"itemprop": "url"})
if artist_url_meta is not None:
artist_href = artist_url_meta.get("content")
if artist_href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, url=self.HOST + artist_href))
artist_meta_name = artist_anchor.find("meta", {"itemprop": "name"})
if artist_meta_name is not None:
artist_name = artist_meta_name.get("content")
if artist_name is not None:
artist_list.append(Artist(
name=artist_name,
source_list=artist_source_list
))
time_soup: BeautifulSoup = album_info_ul.find("time", {"itemprop": "datePublished"})
if time_soup is not None:
raw_datetime = time_soup.get("datetime")
if raw_datetime is not None:
try:
date = ID3Timestamp.strptime(raw_datetime, "%Y-%m-%d")
except ValueError:
self.LOGGER.debug(f"Raw datetime doesn't match time format %Y-%m-%d: {raw_datetime}")
return Album(
title=name,
source_list=source_list,
artist_list=artist_list,
date=date
)
def _get_discography(self, url: MusifyUrl, artist_name: str = None, stop_at_level: int = 1) -> Generator[Album, None, None]:
""" """
POST https://musify.club/artist/filteralbums POST https://musify.club/artist/filteralbums
ArtistID: 280348 ArtistID: 280348
SortOrder.Property: dateCreated SortOrder.Property: dateCreated
SortOrder.IsAscending: false SortOrder.IsAscending: false
X-Requested-With: XMLHttpRequest X-Requested-With: XMLHttpRequest
""" """
_download_all = kwargs.get("download_all", False)
_album_type_blacklist = kwargs.get("album_type_blacklist", main_settings["album_type_blacklist"])
endpoint = self.HOST + "/" + url.source_type.value + "/filteralbums" endpoint = self.HOST + "/" + url.source_type.value + "/filteralbums"
@@ -1079,33 +1045,29 @@ class Musify(Page):
"X-Requested-With": "XMLHttpRequest" "X-Requested-With": "XMLHttpRequest"
}, name="discography_" + url.name_with_id) }, name="discography_" + url.name_with_id)
if r is None: if r is None:
return [] return
soup: BeautifulSoup = BeautifulSoup(r.content, features="html.parser")
soup: BeautifulSoup = self.get_soup_from_response(r)
for card_soup in soup.find_all("div", {"class": "card"}): for card_soup in soup.find_all("div", {"class": "card"}):
yield self._parse_album_card(card_soup, artist_name) album = self._parse_album_card(card_soup, artist_name, **kwargs)
if not self.fetch_options.download_all and album.album_type in self.fetch_options.album_type_blacklist:
continue
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist: artist.album_collection.append(album)
def fetch_artist(self, source: Source, **kwargs) -> Artist:
""" """
fetches artist from source TODO
[x] discography [x] discography
[x] attributes [x] attributes
[] picture gallery [] picture gallery
Args:
source (Source): the source to fetch
stop_at_level: int = 1: if it is false, every album from discograohy will be fetched. Defaults to False.
Returns:
Artist: the artist fetched
""" """
url = parse_url(source.url) url = parse_url(source.url)
artist = self._get_artist_attributes(url) artist = self._fetch_initial_artist(url, source=source, **kwargs)
self._fetch_artist_discography(artist, url, artist.name, **kwargs)
artist.main_album_collection.extend(self._get_discography(url, artist.name))
return artist return artist

View File

@@ -1,65 +0,0 @@
from typing import List, Optional, Type
from urllib.parse import urlparse
import logging
from ..objects import Source, DatabaseObject
from .abstract import Page
from ..objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
Target
)
from ..connection import Connection
from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
class Preset(Page):
# CHANGE
SOURCE_TYPE = SourcePages.PRESET
LOGGER = logging.getLogger("preset")
def __init__(self, *args, **kwargs):
self.connection: Connection = Connection(
host="https://www.preset.cum/",
logger=self.LOGGER
)
super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
return super().get_source_type(source)
def general_search(self, search_query: str) -> List[DatabaseObject]:
return []
def label_search(self, label: Label) -> List[Label]:
return []
def artist_search(self, artist: Artist) -> List[Artist]:
return []
def album_search(self, album: Album) -> List[Album]:
return []
def song_search(self, song: Song) -> List[Song]:
return []
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
return Song()
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
return Album()
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
return Artist()
def fetch_label(self, source: Source, stop_at_level: int = 1) -> Label:
return Label()
def download_song_to_target(self, source: Source, target: Target, desc: str = None) -> DownloadResult:
return DownloadResult()

View File

@@ -9,7 +9,6 @@ from .abstract import Page
from ..objects import ( from ..objects import (
Artist, Artist,
Source, Source,
SourcePages,
Song, Song,
Album, Album,
Label, Label,
@@ -19,6 +18,7 @@ from ..objects import (
) )
from ..connection import Connection from ..connection import Connection
from ..utils.string_processing import clean_song_title from ..utils.string_processing import clean_song_title
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.support_classes.download_result import DownloadResult from ..utils.support_classes.download_result import DownloadResult
from ..utils.config import youtube_settings, main_settings, logging_settings from ..utils.config import youtube_settings, main_settings, logging_settings
@@ -39,10 +39,7 @@ def get_piped_url(path: str = "", params: str = "", query: str = "", fragment: s
class YouTube(SuperYouTube): class YouTube(SuperYouTube):
# CHANGE # CHANGE
SOURCE_TYPE = SourcePages.YOUTUBE SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
LOGGER = logging_settings["youtube_logger"]
NO_ADDITIONAL_DATA_FROM_SONG = True
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
self.connection: Connection = Connection( self.connection: Connection = Connection(
@@ -146,7 +143,7 @@ class YouTube(SuperYouTube):
self.SOURCE_TYPE, get_invidious_url(path="/watch", query=f"v={data['videoId']}") self.SOURCE_TYPE, get_invidious_url(path="/watch", query=f"v={data['videoId']}")
)], )],
notes=FormattedText(html=data["descriptionHtml"] + f"\n<p>{license_str}</ p>" ), notes=FormattedText(html=data["descriptionHtml"] + f"\n<p>{license_str}</ p>" ),
main_artist_list=artist_list artist_list=artist_list
), int(data["published"]) ), int(data["published"])
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song: def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
@@ -287,7 +284,7 @@ class YouTube(SuperYouTube):
self.LOGGER.warning(f"didn't found any playlists with piped, falling back to invidious. (it is unusual)") self.LOGGER.warning(f"didn't found any playlists with piped, falling back to invidious. (it is unusual)")
album_list, artist_name = self.fetch_invidious_album_list(parsed.id) album_list, artist_name = self.fetch_invidious_album_list(parsed.id)
return Artist(name=artist_name, main_album_list=album_list, source_list=[source]) return Artist(name=artist_name, album_list=album_list, source_list=[source])
def download_song_to_target(self, source: Source, target: Target, desc: str = None) -> DownloadResult: def download_song_to_target(self, source: Source, target: Target, desc: str = None) -> DownloadResult:
""" """

View File

@@ -7,7 +7,6 @@ from ..abstract import Page
from ...objects import ( from ...objects import (
Artist, Artist,
Source, Source,
SourcePages,
Song, Song,
Album, Album,
Label, Label,
@@ -25,7 +24,6 @@ def music_card_shelf_renderer(renderer: dict) -> List[DatabaseObject]:
results.extend(parse_renderer(sub_renderer)) results.extend(parse_renderer(sub_renderer))
return results return results
def music_responsive_list_item_flex_column_renderer(renderer: dict) -> List[DatabaseObject]: def music_responsive_list_item_flex_column_renderer(renderer: dict) -> List[DatabaseObject]:
return parse_run_list(renderer.get("text", {}).get("runs", [])) return parse_run_list(renderer.get("text", {}).get("runs", []))
@@ -54,19 +52,24 @@ def music_responsive_list_item_renderer(renderer: dict) -> List[DatabaseObject]:
for result in results: for result in results:
_map[type(result)].append(result) _map[type(result)].append(result)
for song in song_list: if len(song_list) == 1:
song = song_list[0]
song.feature_artist_collection.extend(artist_list)
song.album_collection.extend(album_list) song.album_collection.extend(album_list)
song.main_artist_collection.extend(artist_list) return [song]
for album in album_list: if len(album_list) == 1:
album = album_list[0]
album.artist_collection.extend(artist_list) album.artist_collection.extend(artist_list)
album.song_collection.extend(song_list)
return [album]
if len(song_list) > 0: """
return song_list if len(artist_list) == 1:
if len(album_list) > 0: artist = artist_list[0]
return album_list artist.main_album_collection.extend(album_list)
if len(artist_list) > 0: return [artist]
return artist_list """
return results return results

View File

@@ -3,12 +3,13 @@ from enum import Enum
from ...utils.config import youtube_settings, logging_settings from ...utils.config import youtube_settings, logging_settings
from ...utils.string_processing import clean_song_title from ...utils.string_processing import clean_song_title
from ...utils.enums import SourceType, ALL_SOURCE_TYPES
from ...objects import Source, DatabaseObject from ...objects import Source, DatabaseObject
from ..abstract import Page from ..abstract import Page
from ...objects import ( from ...objects import (
Artist, Artist,
Source, Source,
SourcePages,
Song, Song,
Album, Album,
Label, Label,
@@ -18,7 +19,7 @@ from ...objects import (
LOGGER = logging_settings["youtube_music_logger"] LOGGER = logging_settings["youtube_music_logger"]
SOURCE_PAGE = SourcePages.YOUTUBE_MUSIC SOURCE_PAGE = ALL_SOURCE_TYPES.YOUTUBE
class PageType(Enum): class PageType(Enum):
@@ -40,7 +41,7 @@ def parse_run_element(run_element: dict) -> Optional[DatabaseObject]:
_temp_nav = run_element.get("navigationEndpoint", {}) _temp_nav = run_element.get("navigationEndpoint", {})
is_video = "watchEndpoint" in _temp_nav is_video = "watchEndpoint" in _temp_nav
navigation_endpoint = _temp_nav.get("watchEndpoint" if is_video else "browseEndpoint", {}) navigation_endpoint = _temp_nav.get("watchEndpoint", _temp_nav.get("browseEndpoint", {}))
element_type = PageType.SONG element_type = PageType.SONG
page_type_string = navigation_endpoint.get("watchEndpointMusicSupportedConfigs", {}).get("watchEndpointMusicConfig", {}).get("musicVideoType", "") page_type_string = navigation_endpoint.get("watchEndpointMusicSupportedConfigs", {}).get("watchEndpointMusicConfig", {}).get("musicVideoType", "")
@@ -51,7 +52,7 @@ def parse_run_element(run_element: dict) -> Optional[DatabaseObject]:
except ValueError: except ValueError:
return return
element_id = navigation_endpoint.get("videoId" if is_video else "browseId") element_id = navigation_endpoint.get("videoId", navigation_endpoint.get("browseId"))
element_text = run_element.get("text") element_text = run_element.get("text")
if element_id is None or element_text is None: if element_id is None or element_text is None:
@@ -60,7 +61,11 @@ def parse_run_element(run_element: dict) -> Optional[DatabaseObject]:
if element_type == PageType.SONG or (element_type == PageType.VIDEO and not youtube_settings["youtube_music_clean_data"]) or (element_type == PageType.OFFICIAL_MUSIC_VIDEO and not youtube_settings["youtube_music_clean_data"]): if element_type == PageType.SONG or (element_type == PageType.VIDEO and not youtube_settings["youtube_music_clean_data"]) or (element_type == PageType.OFFICIAL_MUSIC_VIDEO and not youtube_settings["youtube_music_clean_data"]):
source = Source(SOURCE_PAGE, f"https://music.youtube.com/watch?v={element_id}") source = Source(SOURCE_PAGE, f"https://music.youtube.com/watch?v={element_id}")
return Song(title=clean_song_title(element_text), source_list=[source])
return Song(
title=clean_song_title(element_text),
source_list=[source]
)
if element_type == PageType.ARTIST or (element_type == PageType.CHANNEL and not youtube_settings["youtube_music_clean_data"]): if element_type == PageType.ARTIST or (element_type == PageType.CHANNEL and not youtube_settings["youtube_music_clean_data"]):
source = Source(SOURCE_PAGE, f"https://music.youtube.com/channel/{element_id}") source = Source(SOURCE_PAGE, f"https://music.youtube.com/channel/{element_id}")

View File

@@ -10,7 +10,6 @@ from ..abstract import Page
from ...objects import ( from ...objects import (
Artist, Artist,
Source, Source,
SourcePages,
Song, Song,
Album, Album,
Label, Label,
@@ -21,6 +20,7 @@ from ...objects import (
from ...connection import Connection from ...connection import Connection
from ...utils.support_classes.download_result import DownloadResult from ...utils.support_classes.download_result import DownloadResult
from ...utils.config import youtube_settings, logging_settings, main_settings from ...utils.config import youtube_settings, logging_settings, main_settings
from ...utils.enums import SourceType, ALL_SOURCE_TYPES
def get_invidious_url(path: str = "", params: str = "", query: str = "", fragment: str = "") -> str: def get_invidious_url(path: str = "", params: str = "", query: str = "", fragment: str = "") -> str:
@@ -50,7 +50,7 @@ class YouTubeUrl:
""" """
def __init__(self, url: str) -> None: def __init__(self, url: str) -> None:
self.SOURCE_TYPE = SourcePages.YOUTUBE self.SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
""" """
Raises Index exception for wrong url, and value error for not found enum type Raises Index exception for wrong url, and value error for not found enum type
@@ -58,9 +58,6 @@ class YouTubeUrl:
self.id = "" self.id = ""
parsed = urlparse(url=url) parsed = urlparse(url=url)
if parsed.netloc == "music.youtube.com":
self.SOURCE_TYPE = SourcePages.YOUTUBE_MUSIC
self.url_type: YouTubeUrlType self.url_type: YouTubeUrlType
type_frag_list = parsed.path.split("/") type_frag_list = parsed.path.split("/")
@@ -124,8 +121,7 @@ class YouTubeUrl:
class SuperYouTube(Page): class SuperYouTube(Page):
# CHANGE # CHANGE
SOURCE_TYPE = SourcePages.YOUTUBE SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
LOGGER = logging_settings["youtube_logger"]
NO_ADDITIONAL_DATA_FROM_SONG = False NO_ADDITIONAL_DATA_FROM_SONG = False
@@ -145,6 +141,8 @@ class SuperYouTube(Page):
_sponsorblock_connection: Connection = Connection() _sponsorblock_connection: Connection = Connection()
self.sponsorblock = python_sponsorblock.SponsorBlock(silent=True, session=_sponsorblock_connection.session) self.sponsorblock = python_sponsorblock.SponsorBlock(silent=True, session=_sponsorblock_connection.session)
super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]: def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
_url_type = { _url_type = {
YouTubeUrlType.CHANNEL: Artist, YouTubeUrlType.CHANNEL: Artist,

View File

@@ -8,6 +8,7 @@ import json
from dataclasses import dataclass from dataclasses import dataclass
import re import re
from functools import lru_cache from functools import lru_cache
from collections import defaultdict
import youtube_dl import youtube_dl
from youtube_dl.extractor.youtube import YoutubeIE from youtube_dl.extractor.youtube import YoutubeIE
@@ -17,25 +18,31 @@ from ...utils.exception.config import SettingValueError
from ...utils.config import main_settings, youtube_settings, logging_settings from ...utils.config import main_settings, youtube_settings, logging_settings
from ...utils.shared import DEBUG, DEBUG_YOUTUBE_INITIALIZING from ...utils.shared import DEBUG, DEBUG_YOUTUBE_INITIALIZING
from ...utils.string_processing import clean_song_title from ...utils.string_processing import clean_song_title
from ...utils import get_current_millis from ...utils import get_current_millis, traverse_json_path
from ...utils import dump_to_file from ...utils import dump_to_file
from ...objects import Source, DatabaseObject, ID3Timestamp, Artwork
from ..abstract import Page from ..abstract import Page
from ...objects import ( from ...objects import (
Artist, DatabaseObject as DataObject,
Source, Source,
SourcePages, FormattedText,
ID3Timestamp,
Artwork,
Artist,
Song, Song,
Album, Album,
Label, Label,
Target Target,
Lyrics,
) )
from ...connection import Connection from ...connection import Connection
from ...utils.enums import SourceType, ALL_SOURCE_TYPES
from ...utils.enums.album import AlbumType
from ...utils.support_classes.download_result import DownloadResult from ...utils.support_classes.download_result import DownloadResult
from ._list_render import parse_renderer from ._list_render import parse_renderer
from ._music_object_render import parse_run_element
from .super_youtube import SuperYouTube from .super_youtube import SuperYouTube
@@ -162,11 +169,16 @@ class MusicKrakenYoutubeIE(YoutubeIE):
ALBUM_TYPE_MAP = {
"Single": AlbumType.SINGLE,
"Album": AlbumType.STUDIO_ALBUM,
"EP": AlbumType.EP,
}
class YoutubeMusic(SuperYouTube): class YoutubeMusic(SuperYouTube):
# CHANGE # CHANGE
SOURCE_TYPE = SourcePages.YOUTUBE_MUSIC SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
LOGGER = logging_settings["youtube_music_logger"]
def __init__(self, *args, ydl_opts: dict = None, **kwargs): def __init__(self, *args, ydl_opts: dict = None, **kwargs):
self.yt_music_connection: YoutubeMusicConnection = YoutubeMusicConnection( self.yt_music_connection: YoutubeMusicConnection = YoutubeMusicConnection(
@@ -182,8 +194,7 @@ class YoutubeMusic(SuperYouTube):
self.start_millis = get_current_millis() self.start_millis = get_current_millis()
if self.credentials.api_key == "" or DEBUG_YOUTUBE_INITIALIZING: self._fetch_from_main_page()
self._fetch_from_main_page()
SuperYouTube.__init__(self, *args, **kwargs) SuperYouTube.__init__(self, *args, **kwargs)
@@ -204,6 +215,8 @@ class YoutubeMusic(SuperYouTube):
self.download_values_by_url: dict = {} self.download_values_by_url: dict = {}
self.not_download: Dict[str, DownloadError] = {} self.not_download: Dict[str, DownloadError] = {}
super().__init__(*args, **kwargs)
def _fetch_from_main_page(self): def _fetch_from_main_page(self):
""" """
===API=KEY=== ===API=KEY===
@@ -336,10 +349,10 @@ class YoutubeMusic(SuperYouTube):
default='{}' default='{}'
)) or {} )) or {}
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]: def get_source_type(self, source: Source) -> Optional[Type[DataObject]]:
return super().get_source_type(source) return super().get_source_type(source)
def general_search(self, search_query: str) -> List[DatabaseObject]: def general_search(self, search_query: str) -> List[DataObject]:
search_query = search_query.strip() search_query = search_query.strip()
urlescaped_query: str = quote(search_query.strip().replace(" ", "+")) urlescaped_query: str = quote(search_query.strip().replace(" ", "+"))
@@ -401,7 +414,7 @@ class YoutubeMusic(SuperYouTube):
return results return results
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist: def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
artist = Artist() artist = Artist(source_list=[source])
# construct the request # construct the request
url = urlparse(source.url) url = urlparse(source.url)
@@ -421,6 +434,19 @@ class YoutubeMusic(SuperYouTube):
if DEBUG: if DEBUG:
dump_to_file(f"{browse_id}.json", r.text, is_json=True, exit_after_dump=False) dump_to_file(f"{browse_id}.json", r.text, is_json=True, exit_after_dump=False)
# artist details
data: dict = r.json()
header = data.get("header", {})
musicDetailHeaderRenderer = header.get("musicDetailHeaderRenderer", {})
title_runs: List[dict] = musicDetailHeaderRenderer.get("title", {}).get("runs", [])
subtitle_runs: List[dict] = musicDetailHeaderRenderer.get("subtitle", {}).get("runs", [])
if len(title_runs) > 0:
artist.name = title_runs[0].get("text", artist.name)
# fetch discography
renderer_list = r.json().get("contents", {}).get("singleColumnBrowseResultsRenderer", {}).get("tabs", [{}])[ renderer_list = r.json().get("contents", {}).get("singleColumnBrowseResultsRenderer", {}).get("tabs", [{}])[
0].get("tabRenderer", {}).get("content", {}).get("sectionListRenderer", {}).get("contents", []) 0].get("tabRenderer", {}).get("content", {}).get("sectionListRenderer", {}).get("contents", [])
@@ -465,6 +491,46 @@ class YoutubeMusic(SuperYouTube):
if DEBUG: if DEBUG:
dump_to_file(f"{browse_id}.json", r.text, is_json=True, exit_after_dump=False) dump_to_file(f"{browse_id}.json", r.text, is_json=True, exit_after_dump=False)
data = r.json()
# album details
header = data.get("header", {})
musicDetailHeaderRenderer = header.get("musicDetailHeaderRenderer", {})
title_runs: List[dict] = musicDetailHeaderRenderer.get("title", {}).get("runs", [])
subtitle_runs: List[dict] = musicDetailHeaderRenderer.get("subtitle", {}).get("runs", [])
if len(title_runs) > 0:
album.title = title_runs[0].get("text", album.title)
def other_parse_run(run: dict) -> str:
nonlocal album
if "text" not in run:
return
text = run["text"]
is_text_field = len(run.keys()) == 1
# regex that text is a year
if is_text_field and re.match(r"\d{4}", text):
album.date = ID3Timestamp.strptime(text, "%Y")
return
if text in ALBUM_TYPE_MAP:
album.album_type = ALBUM_TYPE_MAP[text]
return
if not is_text_field:
r = parse_run_element(run)
if r is not None:
album.add_list_of_other_objects([r])
return
for _run in subtitle_runs:
other_parse_run(_run)
# tracklist
renderer_list = r.json().get("contents", {}).get("singleColumnBrowseResultsRenderer", {}).get("tabs", [{}])[ renderer_list = r.json().get("contents", {}).get("singleColumnBrowseResultsRenderer", {}).get("tabs", [{}])[
0].get("tabRenderer", {}).get("content", {}).get("sectionListRenderer", {}).get("contents", []) 0].get("tabRenderer", {}).get("content", {}).get("sectionListRenderer", {}).get("contents", [])
@@ -472,20 +538,75 @@ class YoutubeMusic(SuperYouTube):
for i, content in enumerate(renderer_list): for i, content in enumerate(renderer_list):
dump_to_file(f"{i}-album-renderer.json", json.dumps(content), is_json=True, exit_after_dump=False) dump_to_file(f"{i}-album-renderer.json", json.dumps(content), is_json=True, exit_after_dump=False)
results = []
"""
cant use fixed indices, because if something has no entries, the list dissappears
instead I have to try parse everything, and just reject community playlists and profiles.
"""
for renderer in renderer_list: for renderer in renderer_list:
results.extend(parse_renderer(renderer)) album.add_list_of_other_objects(parse_renderer(renderer))
album.add_list_of_other_objects(results) for song in album.song_collection:
for song_source in song.source_collection:
song_source.additional_data["playlist_id"] = browse_id
return album return album
def fetch_lyrics(self, video_id: str, playlist_id: str = None) -> str:
"""
1. fetches the tabs of a song, to get the browse id
2. finds the browse id of the lyrics
3. fetches the lyrics with the browse id
"""
request_data = {
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}},
"videoId": video_id,
}
if playlist_id is not None:
request_data["playlistId"] = playlist_id
tab_request = self.yt_music_connection.post(
url=get_youtube_url(path="/youtubei/v1/next", query=f"prettyPrint=false"),
json=request_data,
name=f"fetch_song_tabs_{video_id}.json",
)
if tab_request is None:
return None
dump_to_file(f"fetch_song_tabs_{video_id}.json", tab_request.text, is_json=True, exit_after_dump=False)
tab_data: dict = tab_request.json()
tabs = traverse_json_path(tab_data, "contents.singleColumnMusicWatchNextResultsRenderer.tabbedRenderer.watchNextTabbedResultsRenderer.tabs", default=[])
browse_id = None
for tab in tabs:
pageType = traverse_json_path(tab, "tabRenderer.endpoint.browseEndpoint.browseEndpointContextSupportedConfigs.browseEndpointContextMusicConfig.pageType", default="")
if pageType in ("MUSIC_TAB_TYPE_LYRICS", "MUSIC_PAGE_TYPE_TRACK_LYRICS") or "lyrics" in pageType.lower():
browse_id = traverse_json_path(tab, "tabRenderer.endpoint.browseEndpoint.browseId", default=None)
if browse_id is not None:
break
if browse_id is None:
return None
r = self.yt_music_connection.post(
url=get_youtube_url(path="/youtubei/v1/browse", query=f"prettyPrint=false"),
json={
"browseId": browse_id,
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}}
},
name=f"fetch_song_lyrics_{video_id}.json"
)
if r is None:
return None
dump_to_file(f"fetch_song_lyrics_{video_id}.json", r.text, is_json=True, exit_after_dump=False)
data = r.json()
lyrics_text = traverse_json_path(data, "contents.sectionListRenderer.contents[0].musicDescriptionShelfRenderer.description.runs[0].text", default=None)
if lyrics_text is None:
return None
return Lyrics(FormattedText(plain=lyrics_text))
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song: def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
ydl_res: dict = {} ydl_res: dict = {}
@@ -498,7 +619,19 @@ class YoutubeMusic(SuperYouTube):
self.fetch_media_url(source=source, ydl_res=ydl_res) self.fetch_media_url(source=source, ydl_res=ydl_res)
artist_name = ydl_res.get("artist", ydl_res.get("uploader", "")).rstrip(" - Topic") artist_names = []
uploader = ydl_res.get("uploader", "")
if uploader.endswith(" - Topic"):
artist_names = [uploader.rstrip(" - Topic")]
artist_list = [
Artist(
name=name,
source_list=[Source(
self.SOURCE_TYPE,
f"https://music.youtube.com/channel/{ydl_res.get('channel_id', ydl_res.get('uploader_id', ''))}"
)]
) for name in artist_names]
album_list = [] album_list = []
if "album" in ydl_res: if "album" in ydl_res:
@@ -507,25 +640,57 @@ class YoutubeMusic(SuperYouTube):
date=ID3Timestamp.strptime(ydl_res.get("upload_date"), "%Y%m%d"), date=ID3Timestamp.strptime(ydl_res.get("upload_date"), "%Y%m%d"),
)) ))
return Song( artist_name = artist_names[0] if len(artist_names) > 0 else None
song = Song(
title=ydl_res.get("track", clean_song_title(ydl_res.get("title"), artist_name=artist_name)), title=ydl_res.get("track", clean_song_title(ydl_res.get("title"), artist_name=artist_name)),
note=ydl_res.get("descriptions"), note=ydl_res.get("descriptions"),
album_list=album_list, album_list=album_list,
length=int(ydl_res.get("duration", 0)) * 1000, length=int(ydl_res.get("duration", 0)) * 1000,
artwork=Artwork(*ydl_res.get("thumbnails", [])), artwork=Artwork(*ydl_res.get("thumbnails", [])),
main_artist_list=[Artist( artist_list=artist_list,
name=artist_name,
source_list=[Source(
SourcePages.YOUTUBE_MUSIC,
f"https://music.youtube.com/channel/{ydl_res.get('channel_id', ydl_res.get('uploader_id', ''))}"
)]
)],
source_list=[Source( source_list=[Source(
SourcePages.YOUTUBE_MUSIC, self.SOURCE_TYPE,
f"https://music.youtube.com/watch?v={ydl_res.get('id')}" f"https://music.youtube.com/watch?v={ydl_res.get('id')}"
), source], ), source],
) )
# other song details
parsed_url = urlparse(source.url)
browse_id = parse_qs(parsed_url.query)['v'][0]
request_data = {
"captionParams": {},
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}},
"videoId": browse_id,
}
if "playlist_id" in source.additional_data:
request_data["playlistId"] = source.additional_data["playlist_id"]
initial_details = self.yt_music_connection.post(
url=get_youtube_url(path="/youtubei/v1/player", query=f"prettyPrint=false"),
json=request_data,
name=f"fetch_song_{browse_id}.json",
)
if initial_details is None:
return song
dump_to_file(f"fetch_song_{browse_id}.json", initial_details.text, is_json=True, exit_after_dump=False)
data = initial_details.json()
video_details = data.get("videoDetails", {})
browse_id = video_details.get("videoId", browse_id)
song.title = video_details.get("title", song.title)
if video_details.get("isLiveContent", False):
for album in song.album_list:
album.album_type = AlbumType.LIVE_ALBUM
for thumbnail in video_details.get("thumbnails", []):
song.artwork.append(**thumbnail)
song.lyrics_collection.append(self.fetch_lyrics(browse_id, playlist_id=request_data.get("playlistId")))
return song
def fetch_media_url(self, source: Source, ydl_res: dict = None) -> dict: def fetch_media_url(self, source: Source, ydl_res: dict = None) -> dict:
def _get_best_format(format_list: List[Dict]) -> dict: def _get_best_format(format_list: List[Dict]) -> dict:
@@ -562,7 +727,6 @@ class YoutubeMusic(SuperYouTube):
self.download_values_by_url[source.url] = { self.download_values_by_url[source.url] = {
"url": _best_format.get("url"), "url": _best_format.get("url"),
"chunk_size": _best_format.get("downloader_options", {}).get("http_chunk_size", main_settings["chunk_size"]),
"headers": _best_format.get("http_headers", {}), "headers": _best_format.get("http_headers", {}),
} }
@@ -581,8 +745,9 @@ class YoutubeMusic(SuperYouTube):
raw_headers=True, raw_headers=True,
disable_cache=True, disable_cache=True,
headers=media.get("headers", {}), headers=media.get("headers", {}),
# chunk_size=media.get("chunk_size", main_settings["chunk_size"]), chunk_size=main_settings["chunk_size"],
method="GET", method="GET",
timeout=5,
) )
else: else:
result = DownloadResult(error_message=str(media.get("error") or self.not_download[source.hash_url])) result = DownloadResult(error_message=str(media.get("error") or self.not_download[source.hash_url]))

View File

@@ -3,24 +3,35 @@ from pathlib import Path
import json import json
import logging import logging
import inspect import inspect
from typing import List, Union
from .shared import DEBUG, DEBUG_LOGGING, DEBUG_DUMP, DEBUG_TRACE, DEBUG_OBJECT_TRACE, DEBUG_OBJECT_TRACE_CALLSTACK from .shared import DEBUG, DEBUG_LOGGING, DEBUG_DUMP, DEBUG_TRACE, DEBUG_OBJECT_TRACE, DEBUG_OBJECT_TRACE_CALLSTACK
from .config import config, read_config, write_config from .config import config, read_config, write_config
from .enums.colors import BColors from .enums.colors import BColors
from .path_manager import LOCATIONS from .path_manager import LOCATIONS
from .hacking import merge_args
""" """
IO functions IO functions
""" """
def _apply_color(msg: str, color: BColors) -> str: def _apply_color(msg: str, color: BColors) -> str:
if not isinstance(msg, str):
msg = str(msg)
endc = BColors.ENDC.value
if color is BColors.ENDC: if color is BColors.ENDC:
return msg return msg
msg = msg.replace(BColors.ENDC.value, BColors.ENDC.value + color.value)
return color.value + msg + BColors.ENDC.value return color.value + msg + BColors.ENDC.value
def output(msg: str, color: BColors = BColors.ENDC): @merge_args(print)
print(_apply_color(msg, color)) def output(*msg: List[str], color: BColors = BColors.ENDC, **kwargs):
print(*(_apply_color(s, color) for s in msg), **kwargs)
def user_input(msg: str, color: BColors = BColors.ENDC): def user_input(msg: str, color: BColors = BColors.ENDC):
@@ -71,6 +82,43 @@ def object_trace(obj):
misc functions misc functions
""" """
def traverse_json_path(data, path: Union[str, List[str]], default=None):
"""
Path parts are concatenated with . or wrapped with [""] for object keys and wrapped in [] for array indices.
"""
if isinstance(path, str):
path = path.replace('["', '.').replace('"]', '.').replace("[", ".").replace("]", ".")
path = [p for p in path.split(".") if len(p) > 0]
if len(path) <= 0:
return data
current = path[0]
path = path[1:]
new_data = None
if isinstance(data, dict):
new_data = data.get(current)
elif isinstance(data, list):
try:
new_data = data[int(current)]
except (IndexError, ValueError):
pass
if new_data is None:
return default
return traverse_json_path(data=new_data, path=path, default=default)
_auto_increment = 0
def generate_id() -> int:
global _auto_increment
_auto_increment += 1
return _auto_increment
def get_current_millis() -> int: def get_current_millis() -> int:
dt = datetime.now() dt = datetime.now()
return int(dt.microsecond / 1_000) return int(dt.microsecond / 1_000)

View File

@@ -59,6 +59,11 @@ Reference for the logging formats: https://docs.python.org/3/library/logging.htm
description="The logger for the musify scraper.", description="The logger for the musify scraper.",
default_value="musify" default_value="musify"
), ),
LoggerAttribute(
name="musicbrainz_logger",
description="The logger for the musicbrainz scraper.",
default_value="musicbrainz"
),
LoggerAttribute( LoggerAttribute(
name="youtube_logger", name="youtube_logger",
description="The logger for the youtube scraper.", description="The logger for the youtube scraper.",

View File

@@ -19,7 +19,7 @@ config = Config((
You can use Audio formats which support ID3.2 and ID3.1, You can use Audio formats which support ID3.2 and ID3.1,
but you will have cleaner Metadata using ID3.2."""), but you will have cleaner Metadata using ID3.2."""),
Attribute(name="result_history", default_value=False, description="""If enabled, you can go back to the previous results. Attribute(name="result_history", default_value=True, description="""If enabled, you can go back to the previous results.
The consequence is a higher meory consumption, because every result is saved."""), The consequence is a higher meory consumption, because every result is saved."""),
Attribute(name="history_length", default_value=8, description="""You can choose how far back you can go in the result history. Attribute(name="history_length", default_value=8, description="""You can choose how far back you can go in the result history.
The further you choose to be able to go back, the higher the memory usage. The further you choose to be able to go back, the higher the memory usage.

View File

@@ -1 +1,54 @@
from .source import SourcePages from __future__ import annotations
from dataclasses import dataclass
from typing import Optional, TYPE_CHECKING, Type
if TYPE_CHECKING:
from ...pages.abstract import Page
@dataclass
class SourceType:
name: str
homepage: Optional[str] = None
download_priority: int = 0
page_type: Type[Page] = None
page: Page = None
def register_page(self, page: Page):
self.page = page
def __hash__(self):
return hash(self.name)
@property
def has_page(self) -> bool:
return self.page is not None
# for backwards compatibility
@property
def value(self) -> str:
return self.name
class ALL_SOURCE_TYPES:
YOUTUBE = SourceType(name="youtube", homepage="https://music.youtube.com/")
BANDCAMP = SourceType(name="bandcamp", homepage="https://bandcamp.com/", download_priority=10)
MUSIFY = SourceType(name="musify", homepage="https://musify.club/", download_priority=7)
GENIUS = SourceType(name="genius", homepage="https://genius.com/")
MUSICBRAINZ = SourceType(name="musicbrainz", homepage="https://musicbrainz.org/")
ENCYCLOPAEDIA_METALLUM = SourceType(name="encyclopaedia metallum")
DEEZER = SourceType(name="deezer", homepage="https://www.deezer.com/")
SPOTIFY = SourceType(name="spotify", homepage="https://open.spotify.com/")
# This has nothing to do with audio, but bands can be here
WIKIPEDIA = SourceType(name="wikipedia", homepage="https://en.wikipedia.org/wiki/Main_Page")
INSTAGRAM = SourceType(name="instagram", homepage="https://www.instagram.com/")
FACEBOOK = SourceType(name="facebook", homepage="https://www.facebook.com/")
TWITTER = SourceType(name="twitter", homepage="https://twitter.com/")
# Yes somehow this ancient site is linked EVERYWHERE
MYSPACE = SourceType(name="myspace", homepage="https://myspace.com/")
MANUAL = SourceType(name="manual")
PRESET = SourceType(name="preset")

View File

@@ -1,50 +0,0 @@
from enum import Enum
class SourceTypes(Enum):
SONG = "song"
ALBUM = "album"
ARTIST = "artist"
LYRICS = "lyrics"
class SourcePages(Enum):
YOUTUBE = "youtube"
MUSIFY = "musify"
YOUTUBE_MUSIC = "youtube music"
GENIUS = "genius"
MUSICBRAINZ = "musicbrainz"
ENCYCLOPAEDIA_METALLUM = "encyclopaedia metallum"
BANDCAMP = "bandcamp"
DEEZER = "deezer"
SPOTIFY = "spotify"
# This has nothing to do with audio, but bands can be here
WIKIPEDIA = "wikipedia"
INSTAGRAM = "instagram"
FACEBOOK = "facebook"
TWITTER = "twitter" # I will use nitter though lol
MYSPACE = "myspace" # Yes somehow this ancient site is linked EVERYWHERE
MANUAL = "manual"
PRESET = "preset"
@classmethod
def get_homepage(cls, attribute) -> str:
homepage_map = {
cls.YOUTUBE: "https://www.youtube.com/",
cls.MUSIFY: "https://musify.club/",
cls.MUSICBRAINZ: "https://musicbrainz.org/",
cls.ENCYCLOPAEDIA_METALLUM: "https://www.metal-archives.com/",
cls.GENIUS: "https://genius.com/",
cls.BANDCAMP: "https://bandcamp.com/",
cls.DEEZER: "https://www.deezer.com/",
cls.INSTAGRAM: "https://www.instagram.com/",
cls.FACEBOOK: "https://www.facebook.com/",
cls.SPOTIFY: "https://open.spotify.com/",
cls.TWITTER: "https://twitter.com/",
cls.MYSPACE: "https://myspace.com/",
cls.WIKIPEDIA: "https://en.wikipedia.org/wiki/Main_Page"
}
return homepage_map[attribute]

View File

@@ -1 +1,23 @@
__all__ = ["config"] class MKBaseException(Exception):
def __init__(self, message: str = None, **kwargs) -> None:
self.message = message
super().__init__(message, **kwargs)
# Downloading
class MKDownloadException(MKBaseException):
pass
class MKMissingNameException(MKDownloadException):
pass
# Frontend
class MKFrontendException(MKBaseException):
pass
class MKInvalidInputException(MKFrontendException):
pass

View File

@@ -78,7 +78,14 @@ def _merge(
drop_args = [] drop_args = []
if drop_kwonlyargs is None: if drop_kwonlyargs is None:
drop_kwonlyargs = [] drop_kwonlyargs = []
source_spec = inspect.getfullargspec(source)
is_builtin = False
try:
source_spec = inspect.getfullargspec(source)
except TypeError:
is_builtin = True
source_spec = inspect.FullArgSpec(type(source).__name__, [], [], [], [], [], [])
dest_spec = inspect.getfullargspec(dest) dest_spec = inspect.getfullargspec(dest)
if source_spec.varargs or source_spec.varkw: if source_spec.varargs or source_spec.varkw:
@@ -128,13 +135,15 @@ def _merge(
'co_kwonlyargcount': len(kwonlyargs_merged), 'co_kwonlyargcount': len(kwonlyargs_merged),
'co_posonlyargcount': dest.__code__.co_posonlyargcount, 'co_posonlyargcount': dest.__code__.co_posonlyargcount,
'co_nlocals': len(args_all), 'co_nlocals': len(args_all),
'co_flags': source.__code__.co_flags,
'co_varnames': args_all, 'co_varnames': args_all,
'co_filename': dest.__code__.co_filename, 'co_filename': dest.__code__.co_filename,
'co_name': dest.__code__.co_name, 'co_name': dest.__code__.co_name,
'co_firstlineno': dest.__code__.co_firstlineno, 'co_firstlineno': dest.__code__.co_firstlineno,
} }
if hasattr(source, "__code__"):
replace_kwargs['co_flags'] = source.__code__.co_flags
if PY310: if PY310:
replace_kwargs['co_linetable'] = dest.__code__.co_linetable replace_kwargs['co_linetable'] = dest.__code__.co_linetable
else: else:
@@ -151,7 +160,7 @@ def _merge(
len(kwonlyargs_merged), len(kwonlyargs_merged),
_blank.__code__.co_nlocals, _blank.__code__.co_nlocals,
_blank.__code__.co_stacksize, _blank.__code__.co_stacksize,
source.__code__.co_flags, source.__code__.co_flags if hasattr(source, "__code__") else dest.__code__.co_flags,
_blank.__code__.co_code, (), (), _blank.__code__.co_code, (), (),
args_all, dest.__code__.co_filename, args_all, dest.__code__.co_filename,
dest.__code__.co_name, dest.__code__.co_name,
@@ -171,6 +180,9 @@ def _merge(
dest_ret = dest.__annotations__['return'] dest_ret = dest.__annotations__['return']
for v in ('__kwdefaults__', '__annotations__'): for v in ('__kwdefaults__', '__annotations__'):
if not hasattr(source, v):
continue
out = getattr(source, v) out = getattr(source, v)
if out is None: if out is None:
out = {} out = {}

View File

@@ -19,7 +19,8 @@ DEBUG_OBJECT_TRACE = DEBUG and False
DEBUG_OBJECT_TRACE_CALLSTACK = DEBUG_OBJECT_TRACE and False DEBUG_OBJECT_TRACE_CALLSTACK = DEBUG_OBJECT_TRACE and False
DEBUG_YOUTUBE_INITIALIZING = DEBUG and False DEBUG_YOUTUBE_INITIALIZING = DEBUG and False
DEBUG_PAGES = DEBUG and False DEBUG_PAGES = DEBUG and False
DEBUG_DUMP = DEBUG and False DEBUG_DUMP = DEBUG and True
DEBUG_PRINT_ID = DEBUG and True
if DEBUG: if DEBUG:
print("DEBUG ACTIVE") print("DEBUG ACTIVE")

View File

@@ -6,6 +6,7 @@ from functools import lru_cache
from transliterate.exceptions import LanguageDetectionError from transliterate.exceptions import LanguageDetectionError
from transliterate import translit from transliterate import translit
from pathvalidate import sanitize_filename from pathvalidate import sanitize_filename
from urllib.parse import urlparse, ParseResult, parse_qs
COMMON_TITLE_APPENDIX_LIST: Tuple[str, ...] = ( COMMON_TITLE_APPENDIX_LIST: Tuple[str, ...] = (
@@ -21,6 +22,7 @@ def unify(string: str) -> str:
returns a unified str, to make comparisons easy. returns a unified str, to make comparisons easy.
a unified string has the following attributes: a unified string has the following attributes:
- is lowercase - is lowercase
- is transliterated to Latin characters from e.g. Cyrillic
""" """
if string is None: if string is None:
@@ -31,7 +33,8 @@ def unify(string: str) -> str:
except LanguageDetectionError: except LanguageDetectionError:
pass pass
return string.lower() string = unify_punctuation(string)
return string.lower().strip()
def fit_to_file_system(string: Union[str, Path], hidden_ok: bool = False) -> Union[str, Path]: def fit_to_file_system(string: Union[str, Path], hidden_ok: bool = False) -> Union[str, Path]:
@@ -49,7 +52,14 @@ def fit_to_file_system(string: Union[str, Path], hidden_ok: bool = False) -> Uni
string = string[1:] string = string[1:]
string = string.replace("/", "_").replace("\\", "_") string = string.replace("/", "_").replace("\\", "_")
try:
string = translit(string, reversed=True)
except LanguageDetectionError:
pass
string = sanitize_filename(string) string = sanitize_filename(string)
return string return string
if isinstance(string, Path): if isinstance(string, Path):
@@ -106,10 +116,13 @@ def clean_song_title(raw_song_title: str, artist_name: Optional[str] = None) ->
# Remove artist from the start of the title # Remove artist from the start of the title
if raw_song_title.lower().startswith(artist_name.lower()): if raw_song_title.lower().startswith(artist_name.lower()):
raw_song_title = raw_song_title[len(artist_name):].strip()
if raw_song_title.startswith("-"): possible_new_name = raw_song_title[len(artist_name):].strip()
raw_song_title = raw_song_title[1:].strip()
for char in ("-", "", ":", "|"):
if possible_new_name.startswith(char):
raw_song_title = possible_new_name[1:].strip()
break
return raw_song_title.strip() return raw_song_title.strip()
@@ -127,13 +140,45 @@ UNIFY_TO = " "
ALLOWED_LENGTH_DISTANCE = 20 ALLOWED_LENGTH_DISTANCE = 20
def unify_punctuation(to_unify: str) -> str: def unify_punctuation(to_unify: str, unify_to: str = UNIFY_TO) -> str:
for char in string.punctuation: for char in string.punctuation:
to_unify = to_unify.replace(char, UNIFY_TO) to_unify = to_unify.replace(char, unify_to)
return to_unify return to_unify
def hash_url(url: str) -> int: @lru_cache(maxsize=128)
return url.strip().lower().lstrip("https://").lstrip("http://") def hash_url(url: Union[str, ParseResult]) -> str:
if isinstance(url, str):
url = urlparse(url)
unify_to = "-"
def unify_part(part: str) -> str:
nonlocal unify_to
return unify_punctuation(part.lower(), unify_to=unify_to).strip(unify_to)
# netloc
netloc = unify_part(url.netloc)
if netloc.startswith("www" + unify_to):
netloc = netloc[3 + len(unify_to):]
# query
query = url.query
query_dict: Optional[dict] = None
try:
query_dict: dict = parse_qs(url.query, strict_parsing=True)
except ValueError:
# the query couldn't be parsed
pass
if isinstance(query_dict, dict):
# sort keys alphabetically
query = ""
for key, value in sorted(query_dict.items(), key=lambda i: i[0]):
query += f"{key.strip()}-{''.join(i.strip() for i in value)}"
r = f"{netloc}_{unify_part(url.path)}_{unify_part(query)}"
r = r.lower().strip()
return r
def remove_feature_part_from_track(title: str) -> str: def remove_feature_part_from_track(title: str) -> str:

View File

@@ -24,7 +24,7 @@ class Query:
return [self.music_object.name] return [self.music_object.name]
if isinstance(self.music_object, Song): if isinstance(self.music_object, Song):
return [f"{artist.name} - {self.music_object}" for artist in self.music_object.main_artist_collection] return [f"{artist.name} - {self.music_object}" for artist in self.music_object.artist_collection]
if isinstance(self.music_object, Album): if isinstance(self.music_object, Album):
return [f"{artist.name} - {self.music_object}" for artist in self.music_object.artist_collection] return [f"{artist.name} - {self.music_object}" for artist in self.music_object.artist_collection]

View File

@@ -69,7 +69,7 @@ dependencies = [
"toml~=0.10.2", "toml~=0.10.2",
"typing_extensions~=4.7.1", "typing_extensions~=4.7.1",
"python-sponsorblock~=0.0.0", "python-sponsorblock~=0.1",
"youtube_dl", "youtube_dl",
] ]
dynamic = [ dynamic = [

0
tests/__init__.py Normal file
View File

View File

@@ -3,96 +3,98 @@ import unittest
from music_kraken.objects import Song, Album, Artist, Collection, Country from music_kraken.objects import Song, Album, Artist, Collection, Country
class TestCollection(unittest.TestCase): class TestCollection(unittest.TestCase):
@staticmethod def test_song_contains_album(self):
def complicated_object() -> Artist: """
return Artist( Tests that every song contains the album it is added to in its album_collection
name="artist", """
country=Country.by_alpha_2("DE"),
main_album_list=[ a_1 = Album(
Album( title="album",
title="album", song_list= [
song_list=[ Song(title="song"),
Song(
title="song",
album_list=[
Album(title="album", albumsort=123),
],
),
Song(
title="other_song",
album_list=[
Album(title="album", albumsort=423),
],
),
]
),
Album(title="album", barcode="1234567890123"),
] ]
) )
a_2 = a_1.song_collection[0].album_collection[0]
self.assertTrue(a_1.id == a_2.id)
def test_song_album_relation(self): def test_album_contains_song(self):
""" """
Tests that Tests that every album contains the song it is added to in its song_collection
album = album.any_song.one_album """
is the same object s_1 = Song(
title="song",
album_list=[
Album(title="album"),
]
)
s_2 = s_1.album_collection[0].song_collection[0]
self.assertTrue(s_1.id == s_2.id)
def test_auto_add_artist_to_album_feature_artist(self):
"""
Tests that every artist is added to the album's feature_artist_collection per default
""" """
a = self.complicated_object().main_album_collection[0] a_1 = Artist(
b = a.song_collection[0].album_collection[0]
c = a.song_collection[1].album_collection[0]
d = b.song_collection[0].album_collection[0]
e = d.song_collection[0].album_collection[0]
f = e.song_collection[0].album_collection[0]
g = f.song_collection[0].album_collection[0]
self.assertTrue(a.id == b.id == c.id == d.id == e.id == f.id == g.id)
self.assertTrue(a.title == b.title == c.title == d.title == e.title == f.title == g.title == "album")
self.assertTrue(a.barcode == b.barcode == c.barcode == d.barcode == e.barcode == f.barcode == g.barcode == "1234567890123")
self.assertTrue(a.albumsort == b.albumsort == c.albumsort == d.albumsort == e.albumsort == f.albumsort == g.albumsort == 123)
d.title = "new_title"
self.assertTrue(a.title == b.title == c.title == d.title == e.title == f.title == g.title == "new_title")
def test_album_artist_relation(self):
"""
Tests that
artist = artist.any_album.any_song.one_artist
is the same object
"""
a = self.complicated_object()
b = a.main_album_collection[0].artist_collection[0]
c = b.main_album_collection[0].artist_collection[0]
d = c.main_album_collection[0].artist_collection[0]
self.assertTrue(a.id == b.id == c.id == d.id)
self.assertTrue(a.name == b.name == c.name == d.name == "artist")
self.assertTrue(a.country == b.country == c.country == d.country)
def test_artist_artist_relation(self):
artist = Artist(
name="artist", name="artist",
main_album_list=[ album_list=[
Album(title="album")
]
)
a_2 = a_1.album_collection[0].feature_artist_collection[0]
self.assertTrue(a_1.id == a_2.id)
def test_auto_add_artist_to_album_feature_artist_push(self):
"""
Tests that every artist is added to the album's feature_artist_collection per default but pulled into the album's artist_collection if a merge exitst
"""
a_1 = Artist(
name="artist",
album_list=[
Album( Album(
title="album", title="album",
song_list=[
Song(title="song"),
],
artist_list=[ artist_list=[
Artist(name="artist"), Artist(name="artist"),
] ]
) )
] ]
) )
a_2 = a_1.album_collection[0].artist_collection[0]
self.assertTrue(artist.id == artist.main_album_collection[0].song_collection[0].main_artist_collection[0].id) self.assertTrue(a_1.id == a_2.id)
def test_artist_artist_relation(self):
"""
Tests the proper syncing between album.artist_collection and song.artist_collection
"""
album = Album(
title="album",
song_list=[
Song(title="song"),
],
artist_list=[
Artist(name="artist"),
]
)
a_1 = album.artist_collection[0]
a_2 = album.song_collection[0].artist_collection[0]
self.assertTrue(a_1.id == a_2.id)
def test_artist_collection_sync(self): def test_artist_collection_sync(self):
"""
tests the actual implementation of the test above
"""
album_1 = Album( album_1 = Album(
title="album", title="album",
song_list=[ song_list=[
Song(title="song", main_artist_list=[Artist(name="artist")]), Song(title="song", artist_list=[Artist(name="artist")]),
], ],
artist_list=[ artist_list=[
Artist(name="artist"), Artist(name="artist"),
@@ -102,7 +104,7 @@ class TestCollection(unittest.TestCase):
album_2 = Album( album_2 = Album(
title="album", title="album",
song_list=[ song_list=[
Song(title="song", main_artist_list=[Artist(name="artist")]), Song(title="song", artist_list=[Artist(name="artist")]),
], ],
artist_list=[ artist_list=[
Artist(name="artist"), Artist(name="artist"),
@@ -111,17 +113,7 @@ class TestCollection(unittest.TestCase):
album_1.merge(album_2) album_1.merge(album_2)
self.assertTrue(id(album_1.artist_collection) == id(album_1.artist_collection) == id(album_1.song_collection[0].main_artist_collection) == id(album_1.song_collection[0].main_artist_collection)) self.assertTrue(id(album_1.artist_collection) == id(album_1.artist_collection) == id(album_1.song_collection[0].artist_collection) == id(album_1.song_collection[0].artist_collection))
def test_song_artist_relations(self):
a = self.complicated_object()
b = a.main_album_collection[0].song_collection[0].main_artist_collection[0]
c = b.main_album_collection[0].song_collection[0].main_artist_collection[0]
d = c.main_album_collection[0].song_collection[0].main_artist_collection[0]
self.assertTrue(a.id == b.id == c.id == d.id)
self.assertTrue(a.name == b.name == c.name == d.name == "artist")
self.assertTrue(a.country == b.country == c.country == d.country)
if __name__ == "__main__": if __name__ == "__main__":
unittest.main() unittest.main()

35
tests/test_hash_url.py Normal file
View File

@@ -0,0 +1,35 @@
import unittest
from music_kraken.utils.string_processing import hash_url
class TestCollection(unittest.TestCase):
def test_remove_schema(self):
self.assertFalse(hash_url("https://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
self.assertFalse(hash_url("ftp://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
self.assertFalse(hash_url("sftp://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
self.assertFalse(hash_url("http://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
def test_no_punctuation(self):
self.assertNotIn(hash_url("https://www.you_tube.com/watch?v=3jZ_D3ELwOQ"), "you_tube")
self.assertNotIn(hash_url("https://docs.gitea.com/next/install.ation/comparison"), ".")
def test_three_parts(self):
"""
The url is parsed into three parts [netloc; path; query]
Which are then appended to each other with an underscore between.
"""
self.assertTrue(hash_url("https://duckduckgo.com/?t=h_&q=dfasf&ia=web").count("_") == 2)
def test_sort_query(self):
"""
The query is sorted alphabetically
"""
hashed = hash_url("https://duckduckgo.com/?t=h_&q=dfasf&ia=web")
sorted_keys = ["ia-", "q-", "t-"]
self.assertTrue(hashed.index(sorted_keys[0]) < hashed.index(sorted_keys[1]) < hashed.index(sorted_keys[2]))
if __name__ == "__main__":
unittest.main()