Compare commits

..

No commits in common. "experimental" and "fix/metal_archives" have entirely different histories.

65 changed files with 2519 additions and 3074 deletions

22
.vscode/launch.json vendored
View File

@ -1,22 +0,0 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: Current File",
"type": "debugpy",
"request": "launch",
"program": "${file}",
"console": "integratedTerminal"
},
{
"name": "Python Debugger: Download script",
"type": "debugpy",
"request": "launch",
"program": "development/actual_donwload.py",
"console": "integratedTerminal"
}
]
}

18
.vscode/settings.json vendored
View File

@ -16,33 +16,17 @@
},
"python.formatting.provider": "none",
"cSpell.words": [
"albumsort",
"APIC",
"Bandcamp",
"bitrate",
"CALLSTACK",
"DEEZER",
"dotenv",
"encyclopaedia",
"ENDC",
"Gitea",
"iframe",
"isrc",
"itemprop",
"levenshtein",
"metallum",
"MUSICBRAINZ",
"musify",
"OKBLUE",
"OKGREEN",
"pathvalidate",
"Referer",
"sponsorblock",
"tracklist",
"tracksort",
"translit",
"unmap",
"youtube",
"youtubei"
"youtube"
]
}

View File

@ -11,6 +11,7 @@ steps:
build-stable:
image: python
commands:
- sed -i 's/name = "music-kraken"/name = "music-kraken-stable"/' pyproject.toml
- python -m pip install -r requirements-dev.txt
- python3 -m build
environment:

226
README.md
View File

@ -2,43 +2,61 @@
[![Woodpecker CI Status](https://ci.elara.ws/api/badges/59/status.svg)](https://ci.elara.ws/repos/59)
<img src="https://gitea.elara.ws/music-kraken/music-kraken-core/media/branch/experimental/assets/logo.svg" width=300 alt="music kraken logo"/>
<img src="assets/logo.svg" width=300 alt="music kraken logo"/>
- [Installation](#installation)
- [Quick-Guide](#quick-guide)
- [How to search properly](#query)
- [Matrix Space](#matrix-space)
If you want to use this a library or contribute, check out [the wiki](https://gitea.elara.ws/music-kraken/music-kraken-core/wiki) for more information.
- [Music Kraken](#music-kraken)
- [Installation](#installation)
- [From source](#from-source)
- [Notes for WSL](#notes-for-wsl)
- [Quick-Guide](#quick-guide)
- [Query](#query)
- [CONTRIBUTE](#contribute)
- [Matrix Space](#matrix-space)
- [TODO till the next release](#todo-till-the-next-release)
- [Programming Interface / Use as Library](#programming-interface--use-as-library)
- [Quick Overview](#quick-overview)
- [Data Model](#data-model)
- [Data Objects](#data-objects)
- [Creation](#creation)
---
## Installation
You can find and get this project from either [PyPI](https://pypi.org/project/music-kraken/) as a Python-Package,
or simply the source code from [Gitea](https://gitea.elara.ws/music-kraken/music-kraken-core). **
or simply the source code from [GitHub](https://github.com/HeIIow2/music-downloader). Note that even though
everything **SHOULD** work cross-platform, I have only tested it on Ubuntu.
If you enjoy this project, feel free to give it a star on GitHub.
**NOTES**
- Even though everything **SHOULD** work cross-platform, I have only tested it on Ubuntu.
- If you enjoy this project, feel free to give it a star on GitHub.
> THE PyPI PACKAGE IS OUTDATED
### From source
if you use Debian or Ubuntu:
```sh
git clone https://gitea.elara.ws/music-kraken/music-kraken-core.git
python3 -m pip install -e music-kraken-core/
git clone https://github.com/HeIIow2/music-downloader
sudo apt install pandoc
cd music-downloader/
python3 -m pip install -r requirements.txt
```
To update the program, if installed like this, go into the `music-kraken-core` directory and run `git pull`.
then you can add to `~/.bashrc`
### Get it running on other Systems
```
alias music-kraken='cd your/directory/music-downloader/src; python3 -m music_kraken'
alias 🥺='sudo'
```
Here are the collected issues, that are related to running the program on different systems. If you have any issues, feel free to open a new one.
```sh
source ~/.bashrc
music-kraken
```
#### Windows + WSL
### Notes for WSL
Add ` ~/.local/bin` to your `$PATH`. [#2][i2]
If you choose to run it in WSL, make sure ` ~/.local/bin` is added to your `$PATH` [#2][i2]
## Quick-Guide
@ -69,6 +87,10 @@ The escape character is as usual `\`.
---
## CONTRIBUTE
I am happy about every pull request. To contribute look [here](contribute.md).
## Matrix Space
<img align="right" alt="music-kraken logo" src="assets/element_logo.png" width=100>
@ -77,5 +99,171 @@ I decided against creating a discord server, due to various communities get ofte
**Click [this invitation](https://matrix.to/#/#music-kraken:matrix.org) _([https://matrix.to/#/#music-kraken:matrix.org](https://matrix.to/#/#music-kraken:matrix.org))_ to join.**
## TODO till the next release
> These Points will most likely be in the changelogs.
- [x] Migrate away from pandoc, to a more lightweight alternative, that can be installed over PiPY.
- [ ] Update the Documentation of the internal structure. _(could be pushed back one release)_
---
# Programming Interface / Use as Library
This application is $100\%$ centered around Data. Thus, the most important thing for working with musik kraken is, to understand how I structured the data.
## Quick Overview
- explanation of the [Data Model](#data-model)
- how to use the [Data Objects](#data-objects)
- further Dokumentation of _hopefully_ [most relevant classes](documentation/objects.md)
- the [old implementation](documentation/old_implementation.md)
```mermaid
---
title: Quick Overview (outdated)
---
sequenceDiagram
participant pg as Page (eg. YouTube, MB, Musify, ...)
participant obj as DataObjects (eg. Song, Artist, ...)
participant db as DataBase
obj ->> db: write
db ->> obj: read
pg -> obj: find a source for any page, for object.
obj -> pg: add more detailed data from according page.
obj -> pg: if available download audio to target.
```
## Data Model
The Data Structure, that the whole programm is built on looks as follows:
```mermaid
---
title: Music Data
---
erDiagram
Target {
}
Lyrics {
}
Song {
}
Album {
}
Artist {
}
Label {
}
Source {
}
Source }o--|| Song : ""
Source }o--|| Lyrics : ""
Source }o--|| Album : ""
Source }o--|| Artist : ""
Source }o--|| Label : ""
Song }o--o{ Album : AlbumSong
Album }o--o{ Artist : ArtistAlbum
Song }o--o{ Artist : "ArtistSong (features)"
Label }o--o{ Album : LabelAlbum
Label }o--o{ Artist : LabelSong
Song ||--o{ Lyrics : ""
Song ||--o{ Target : ""
```
Ok now this **WILL** look intimidating, thus I break it down quickly.
*That is also the reason I didn't add all Attributes here.*
The most important Entities are:
- Song
- Album
- Artist
- Label
All of them *(and Lyrics)* can have multiple Sources, and every Source can only Point to one of those Element.
The `Target` Entity represents the location on the hard drive a Song has. One Song can have multiple download Locations.
The `Lyrics` Entity simply represents the Lyrics of each Song. One Song can have multiple Lyrics, e.g. Translations.
Here is the simplified Diagramm without only the main Entities.
```mermaid
---
title: simplified Music Data
---
erDiagram
Song {
}
Album {
}
Artist {
}
Label {
}
Song }o--o{ Album : AlbumSong
Album }o--o{ Artist : ArtistAlbum
Song }o--o{ Artist : "ArtistSong (features)"
Label }o--o{ Album : LabelAlbum
Label }o--o{ Artist : LabelSong
```
Looks way more manageable, doesn't it?
The reason every relation here is a `n:m` *(many to many)* relation is not, that it makes sense in the aspekt of modeling reality, but to be able to put data from many Sources in the same Data Model.
Every Service models Data a bit different, and projecting a one-to-many relationship to a many to many relationship without data loss is easy. The other way around it is basically impossible
## Data Objects
> Not 100% accurate yet and *might* change slightly
### Creation
```python
# needs to be added
```
If you just want to start implementing, then just use the code example I provided, I don't care.
For those who don't want any bugs and use it as intended *(which is recommended, cuz I am only one person so there are defs bugs)* continue reading, and read the whole documentation, which may exist in the future xD
[i10]: https://github.com/HeIIow2/music-downloader/issues/10
[i2]: https://github.com/HeIIow2/music-downloader/issues/2

View File

@ -0,0 +1,66 @@
DROP TABLE IF EXISTS artist;
CREATE TABLE artist (
id TEXT PRIMARY KEY NOT NULL,
name TEXT
);
DROP TABLE IF EXISTS artist_release_group;
CREATE TABLE artist_release_group (
artist_id TEXT NOT NULL,
release_group_id TEXT NOT NULL
);
DROP TABLE IF EXISTS artist_track;
CREATE TABLE artist_track (
artist_id TEXT NOT NULL,
track_id TEXT NOT NULL
);
DROP TABLE IF EXISTS release_group;
CREATE TABLE release_group (
id TEXT PRIMARY KEY NOT NULL,
albumartist TEXT,
albumsort INT,
musicbrainz_albumtype TEXT,
compilation TEXT,
album_artist_id TEXT
);
DROP TABLE IF EXISTS release_;
CREATE TABLE release_ (
id TEXT PRIMARY KEY NOT NULL,
release_group_id TEXT NOT NULL,
title TEXT,
copyright TEXT,
album_status TEXT,
language TEXT,
year TEXT,
date TEXT,
country TEXT,
barcode TEXT
);
DROP TABLE IF EXISTS track;
CREATE TABLE track (
id TEXT PRIMARY KEY NOT NULL,
downloaded BOOLEAN NOT NULL DEFAULT 0,
release_id TEXT NOT NULL,
track TEXT,
length INT,
tracknumber TEXT,
isrc TEXT,
genre TEXT,
lyrics TEXT,
path TEXT,
file TEXT,
url TEXT,
src TEXT
);
DROP TABLE IF EXISTS source;
CREATE TABLE source (
track_id TEXT NOT NULL,
src TEXT NOT NULL,
url TEXT NOT NULL,
valid BOOLEAN NOT NULL DEFAULT 1
);

View File

@ -1,15 +1,53 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg version="1.0" width="1024" height="1024" viewBox="0 0 1024.000000 1024.000000"
preserveAspectRatio="xMidYMid meet" id="svg168" sodipodi:docname="02.svg"
inkscape:version="1.2.2 (b0a8486541, 2022-12-01)" xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" xmlns="http://www.w3.org/2000/svg"
<svg
version="1.0"
width="1024.000000pt"
height="1024.000000pt"
viewBox="0 0 1024.000000 1024.000000"
preserveAspectRatio="xMidYMid meet"
id="svg168"
sodipodi:docname="02.svg"
inkscape:version="1.2.2 (b0a8486541, 2022-12-01)"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg">
<defs id="defs172" />
<rect x="0" y="0" width="100%" height="100%" rx="10%" fill="#f0f0f0" id="background" />
<g transform="translate(0.000000,1024.000000) scale(0.100000,-0.100000)" fill="#000000" id="wireframe">
<defs
id="defs172" />
<sodipodi:namedview
id="namedview170"
pagecolor="#ffffff"
bordercolor="#000000"
borderopacity="0.25"
inkscape:showpageshadow="2"
inkscape:pageopacity="0.0"
inkscape:pagecheckerboard="0"
inkscape:deskcolor="#d1d1d1"
inkscape:document-units="pt"
showgrid="false"
inkscape:zoom="0.69140625"
inkscape:cx="437.51412"
inkscape:cy="984.22599"
inkscape:window-width="1866"
inkscape:window-height="1012"
inkscape:window-x="0"
inkscape:window-y="0"
inkscape:window-maximized="1"
inkscape:current-layer="g166" />
<g
transform="translate(0.000000,1024.000000) scale(0.100000,-0.100000)"
fill="#000000"
stroke="none"
id="g166">
<rect
x="10"
y="10"
width="10239.509"
height="10229.297"
rx="1503.97427"
fill="#f0f0f0"
id="rect148"
style="stroke-width:10.1935" />
<path
d="M4784 8535 c-695 -66 -1296 -270 -1819 -616 -369 -245 -627 -477 -843 -763 -304 -402 -461 -948 -479 -1666 -9 -352 13 -581 82 -850 40 -156 61 -215 117 -323 55 -105 114 -169 194 -208 61 -30 69 -32 148 -27 179 12 320 123 356 281 8 38 6 64 -15 154 -14 59 -32 140 -41 178 -8 39 -21 95 -29 125 -41 165 -50 270 -50 565 0 261 3 309 28 480 30 214 28 242 -24 293 -41 40 -146 68 -312 84 -70 6 -127 15 -127 20 0 15 102 293 139 378 79 183 209 386 348 546 129 147 379 360 588 501 124 83 234 147 242 139 3 -3 -21 -36 -54 -73 -178 -203 -321 -426 -411 -643 -110 -265 -152 -484 -153 -804 -1 -338 43 -569 166 -877 56 -138 108 -235 192 -357 83 -119 95 -148 137 -323 54 -224 163 -505 223 -574 50 -57 102 -69 147 -34 46 36 34 86 -63 252 -65 113 -88 182 -107 332 -17 133 -20 142 -164 445 -148 313 -197 440 -250 650 -42 169 -60 311 -60 480 0 575 268 1118 733 1488 260 206 635 354 1060 418 142 21 566 26 722 9 323 -36 644 -133 905 -273 180 -96 322 -205 481 -368 464 -478 615 -1159 402 -1809 -22 -66 -78 -191 -142 -315 -275 -536 -251 -481 -271 -620 -10 -69 -28 -177 -40 -240 -27 -146 -37 -342 -20 -394 15 -47 51 -64 87 -41 73 49 164 319 184 549 17 208 39 271 158 461 197 313 285 530 342 845 31 167 34 543 6 685 -82 408 -210 682 -470 1005 -47 58 -83 107 -81 109 1 2 21 -7 43 -20 22 -13 77 -46 123 -73 324 -190 683 -538 883 -856 91 -145 268 -561 247 -582 -4 -3 -60 -16 -125 -27 -175 -31 -300 -80 -364 -141 -29 -26 -29 -54 -2 -190 64 -330 65 -751 3 -1081 -8 -46 -32 -145 -51 -219 -42 -157 -47 -246 -19 -329 20 -58 68 -118 120 -151 106 -65 273 -77 372 -27 140 71 251 273 328 592 55 229 76 429 76 725 0 991 -288 1664 -949 2213 -577 481 -1339 795 -2151 887 -154 18 -537 21 -696 5z"
id="path150" />

Before

Width:  |  Height:  |  Size: 5.1 KiB

After

Width:  |  Height:  |  Size: 5.8 KiB

BIN
assets/logos/.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 302 KiB

BIN
assets/logos/00.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 30 KiB

BIN
assets/logos/01.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 292 KiB

76
assets/logos/01.svg Normal file
View File

@ -0,0 +1,76 @@
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20010904//EN"
"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<svg version="1.0" xmlns="http://www.w3.org/2000/svg"
width="1024.000000pt" height="1024.000000pt" viewBox="0 0 1024.000000 1024.000000"
preserveAspectRatio="xMidYMid meet">
<g transform="translate(0.000000,1024.000000) scale(0.100000,-0.100000)"
fill="#000000" stroke="none">
<path d="M4965 7890 c-800 -37 -1523 -349 -2220 -960 -398 -349 -585 -575
-740 -895 -186 -381 -255 -705 -255 -1187 l0 -170 -31 7 c-16 4 -57 4 -90 0
l-59 -7 0 -46 c0 -26 7 -85 16 -132 32 -182 33 -172 -25 -256 -84 -120 -144
-270 -172 -427 -19 -116 -7 -352 25 -477 126 -486 561 -875 1080 -965 192 -33
458 -14 628 44 270 93 516 298 632 529 75 150 106 265 106 404 0 221 -64 380
-230 566 -256 287 -315 365 -382 509 -71 151 -81 208 -82 458 -1 207 1 226 26
322 27 103 90 244 147 327 56 80 154 168 237 212 93 49 184 72 354 87 l125 12
3 40 c5 73 -8 80 -150 78 -189 -2 -303 -33 -465 -124 -224 -126 -412 -428
-445 -713 -11 -97 -4 -324 13 -441 29 -193 116 -405 226 -552 22 -28 99 -118
173 -200 210 -233 261 -314 281 -445 24 -155 -6 -312 -94 -483 -54 -106 -197
-252 -312 -319 -177 -103 -317 -145 -515 -153 -202 -8 -362 24 -547 112 -140
66 -228 128 -339 239 -187 187 -278 396 -291 661 -11 242 60 492 169 595 32
30 101 64 176 86 42 12 62 23 61 34 0 8 -15 53 -33 100 -28 72 -37 85 -56 85
-24 0 -33 -11 -50 -57 -20 -57 -68 -20 -94 71 -9 31 -16 72 -16 89 0 32 0 32
50 32 l50 0 0 40 c0 32 5 43 29 61 l29 21 6 147 c11 255 40 516 73 645 84 337
221 619 423 869 58 72 181 197 233 237 39 31 36 11 -24 -121 -106 -239 -174
-489 -198 -729 -6 -58 -14 -124 -17 -147 l-5 -43 83 0 83 0 2 28 c6 87 48 350
69 434 102 408 382 841 719 1111 444 356 1004 562 1620 597 534 30 1027 -72
1505 -310 599 -298 964 -752 1090 -1355 25 -121 41 -502 22 -514 -7 -4 30 -5
81 -3 50 2 94 6 97 9 3 3 5 102 4 221 0 153 -6 251 -18 331 -17 110 -74 358
-95 414 -6 15 -7 27 -2 27 16 0 150 -148 223 -246 171 -231 272 -434 347 -701
66 -234 95 -428 76 -502 -7 -31 -8 -64 -2 -101 4 -30 11 -131 15 -225 l7 -170
70 -1 c135 -3 125 1 125 -41 0 -46 -25 -142 -40 -157 -6 -6 -53 -11 -113 -12
l-102 -1 -7 -49 c-4 -27 -9 -63 -13 -80 -5 -30 -5 -30 71 -54 89 -28 144 -72
205 -162 98 -143 139 -280 139 -459 0 -169 -28 -286 -105 -439 -162 -321 -435
-531 -787 -606 -126 -27 -362 -24 -493 5 -220 50 -383 138 -531 285 -108 109
-156 188 -188 312 -21 80 -20 225 0 317 32 139 97 245 300 488 186 221 240
310 307 502 l32 92 0 290 0 290 -33 95 c-18 52 -61 154 -97 225 -103 207 -209
316 -400 410 -144 72 -238 90 -460 88 -194 -1 -215 -7 -215 -62 0 -28 5 -43
18 -49 9 -5 82 -12 162 -16 235 -10 349 -41 485 -131 143 -95 243 -219 308
-383 57 -141 70 -223 70 -412 -1 -406 -86 -608 -406 -972 -117 -133 -170 -220
-215 -350 -116 -340 24 -729 352 -976 76 -58 249 -149 342 -180 209 -71 469
-85 697 -38 279 57 494 174 699 377 257 256 378 540 378 889 0 103 -5 146 -25
225 -39 157 -123 300 -221 380 l-45 37 21 36 c33 56 78 302 61 331 -5 7 -41
17 -81 22 -40 5 -75 11 -78 14 -2 2 -6 35 -8 72 -22 403 -38 538 -89 728 -152
580 -405 994 -886 1447 -188 177 -268 241 -496 398 -389 269 -901 464 -1397
535 -99 14 -425 36 -486 33 -14 -1 -97 -4 -185 -8z"/>
<path d="M2446 5430 c-70 -11 -124 -41 -200 -111 -74 -68 -120 -161 -142 -289
-52 -305 58 -798 216 -960 71 -73 124 -95 230 -95 107 0 172 27 231 94 114
129 143 301 137 811 l-3 305 -28 57 c-35 72 -110 140 -183 168 -66 24 -175 33
-258 20z m186 -340 c34 -42 39 -83 41 -385 1 -211 -1 -244 -16 -272 -35 -66
-122 -92 -175 -54 -58 41 -67 88 -66 346 1 248 8 353 28 380 32 44 146 35 188
-15z"/>
<path d="M7603 5430 c-118 -24 -229 -113 -266 -216 -37 -99 -47 -568 -17 -779
45 -314 174 -465 398 -465 96 0 158 27 225 99 76 82 111 169 148 377 30 167
34 565 6 664 -68 240 -261 366 -494 320z m153 -331 c31 -25 64 -116 75 -204
13 -106 5 -336 -15 -410 -40 -153 -118 -198 -196 -116 -56 59 -72 114 -78 271
-2 75 0 171 7 215 6 44 13 106 17 137 7 75 38 121 84 128 38 6 84 -3 106 -21z"/>
<path d="M4219 5372 c-199 -52 -323 -212 -364 -468 -6 -38 -4 -42 23 -53 65
-27 107 7 172 136 61 123 123 183 212 208 104 28 199 18 293 -30 41 -21 96
-60 122 -87 54 -54 76 -60 96 -22 29 56 -11 153 -93 228 -104 94 -296 131
-461 88z"/>
<path d="M5739 5278 c-135 -48 -216 -116 -250 -211 -13 -37 -13 -40 10 -58 37
-30 74 -25 105 16 15 19 51 47 79 62 45 23 65 27 142 27 76 0 97 -3 135 -23
113 -60 165 -112 263 -266 42 -65 73 -81 122 -61 58 24 50 47 -96 286 -112
185 -190 238 -359 247 -70 3 -96 0 -151 -19z"/>
<path d="M7247 5166 c-5 -15 -6 -31 -3 -34 8 -8 18 25 14 45 -3 13 -6 10 -11
-11z"/>
<path d="M4626 4169 c-50 -8 -86 -37 -114 -90 -20 -37 -23 -54 -18 -104 21
-232 161 -450 350 -544 175 -88 392 -92 571 -11 185 84 323 271 355 482 19
127 -9 233 -67 256 -30 13 -1008 22 -1077 11z m928 -169 c35 -13 40 -45 21
-121 -35 -134 -108 -240 -196 -284 -65 -33 -172 -48 -309 -43 -100 4 -122 8
-166 31 -89 45 -160 146 -189 269 -22 94 -20 137 8 148 30 12 799 13 831 0z"/>
<path d="M9980 552 c0 -4 21 -28 46 -52 55 -53 55 -40 2 19 -38 41 -48 48 -48
33z"/>
</g>
</svg>

After

Width:  |  Height:  |  Size: 4.9 KiB

BIN
assets/logos/02.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 168 KiB

73
assets/logos/02.svg Normal file
View File

@ -0,0 +1,73 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
version="1.0"
width="1024.000000pt"
height="1024.000000pt"
viewBox="0 0 1024.000000 1024.000000"
preserveAspectRatio="xMidYMid meet"
id="svg168"
sodipodi:docname="02.svg"
inkscape:version="1.2.2 (b0a8486541, 2022-12-01)"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg">
<defs
id="defs172" />
<sodipodi:namedview
id="namedview170"
pagecolor="#ffffff"
bordercolor="#000000"
borderopacity="0.25"
inkscape:showpageshadow="2"
inkscape:pageopacity="0.0"
inkscape:pagecheckerboard="0"
inkscape:deskcolor="#d1d1d1"
inkscape:document-units="pt"
showgrid="false"
inkscape:zoom="0.69140625"
inkscape:cx="437.51412"
inkscape:cy="984.22599"
inkscape:window-width="1866"
inkscape:window-height="1012"
inkscape:window-x="0"
inkscape:window-y="0"
inkscape:window-maximized="1"
inkscape:current-layer="g166" />
<g
transform="translate(0.000000,1024.000000) scale(0.100000,-0.100000)"
fill="#000000"
stroke="none"
id="g166">
<rect
x="10"
y="10"
width="10239.509"
height="10229.297"
rx="1503.97427"
fill="#f0f0f0"
id="rect148"
style="stroke-width:10.1935" />
<path
d="M4784 8535 c-695 -66 -1296 -270 -1819 -616 -369 -245 -627 -477 -843 -763 -304 -402 -461 -948 -479 -1666 -9 -352 13 -581 82 -850 40 -156 61 -215 117 -323 55 -105 114 -169 194 -208 61 -30 69 -32 148 -27 179 12 320 123 356 281 8 38 6 64 -15 154 -14 59 -32 140 -41 178 -8 39 -21 95 -29 125 -41 165 -50 270 -50 565 0 261 3 309 28 480 30 214 28 242 -24 293 -41 40 -146 68 -312 84 -70 6 -127 15 -127 20 0 15 102 293 139 378 79 183 209 386 348 546 129 147 379 360 588 501 124 83 234 147 242 139 3 -3 -21 -36 -54 -73 -178 -203 -321 -426 -411 -643 -110 -265 -152 -484 -153 -804 -1 -338 43 -569 166 -877 56 -138 108 -235 192 -357 83 -119 95 -148 137 -323 54 -224 163 -505 223 -574 50 -57 102 -69 147 -34 46 36 34 86 -63 252 -65 113 -88 182 -107 332 -17 133 -20 142 -164 445 -148 313 -197 440 -250 650 -42 169 -60 311 -60 480 0 575 268 1118 733 1488 260 206 635 354 1060 418 142 21 566 26 722 9 323 -36 644 -133 905 -273 180 -96 322 -205 481 -368 464 -478 615 -1159 402 -1809 -22 -66 -78 -191 -142 -315 -275 -536 -251 -481 -271 -620 -10 -69 -28 -177 -40 -240 -27 -146 -37 -342 -20 -394 15 -47 51 -64 87 -41 73 49 164 319 184 549 17 208 39 271 158 461 197 313 285 530 342 845 31 167 34 543 6 685 -82 408 -210 682 -470 1005 -47 58 -83 107 -81 109 1 2 21 -7 43 -20 22 -13 77 -46 123 -73 324 -190 683 -538 883 -856 91 -145 268 -561 247 -582 -4 -3 -60 -16 -125 -27 -175 -31 -300 -80 -364 -141 -29 -26 -29 -54 -2 -190 64 -330 65 -751 3 -1081 -8 -46 -32 -145 -51 -219 -42 -157 -47 -246 -19 -329 20 -58 68 -118 120 -151 106 -65 273 -77 372 -27 140 71 251 273 328 592 55 229 76 429 76 725 0 991 -288 1664 -949 2213 -577 481 -1339 795 -2151 887 -154 18 -537 21 -696 5z"
id="path150" />
<path
d="M5963 4946 c-158 -51 -243 -191 -243 -398 0 -160 41 -281 122 -359 55 -53 99 -71 178 -72 55 -2 76 3 132 31 119 58 236 210 254 329 14 95 -50 278 -130 370 -72 82 -220 129 -313 99z m376 -302 c58 -49 66 -147 14 -198 -34 -34 -74 -34 -113 2 -57 50 -60 140 -8 193 36 36 67 37 107 3z"
id="path152" />
<path
d="M4089 4943 c-49 -8 -133 -66 -166 -116 -43 -64 -53 -102 -60 -224 -5 -91 -3 -110 21 -186 32 -103 76 -171 140 -214 126 -86 260 -73 354 33 73 82 97 158 97 310 0 121 0 121 -39 198 -51 101 -114 158 -203 186 -63 19 -88 22 -144 13z m-91 -294 c84 -29 79 -157 -8 -219 -65 -46 -110 -3 -113 107 -2 74 8 97 48 113 28 12 37 12 73 -1z"
id="path154" />
<path
d="M2585 3875 c-183 -29 -311 -98 -360 -194 -44 -88 -42 -163 6 -190 35 -20 65 -10 156 53 107 73 131 84 220 103 158 32 281 14 698 -102 301 -84 366 -93 423 -60 65 39 64 70 -5 146 -45 49 -65 58 -272 116 -516 143 -650 163 -866 128z"
id="path156" />
<path
d="M7350 3874 c-174 -23 -417 -78 -635 -145 -71 -22 -172 -49 -223 -59 -52 -10 -96 -21 -99 -24 -3 -3 -9 -24 -13 -48 -11 -57 7 -91 60 -112 74 -29 165 -17 465 63 295 79 375 94 505 94 82 1 131 -4 171 -16 58 -18 151 -69 188 -104 12 -11 38 -29 57 -39 31 -17 37 -17 62 -4 36 21 72 76 72 112 0 70 -68 167 -148 211 -77 42 -192 68 -317 72 -60 2 -126 1 -145 -1z"
id="path158" />
<path
d="M5404 3765 c-207 -147 -263 -172 -364 -162 -77 8 -129 33 -235 111 -86 63 -129 85 -142 73 -13 -13 50 -118 103 -170 82 -83 160 -119 277 -125 150 -8 252 32 350 137 70 75 111 171 73 171 -8 0 -36 -16 -62 -35z"
id="path160" />
<path
d="M3981 3144 c-266 -178 -442 -186 -926 -41 -302 91 -455 97 -612 23 -157 -75 -251 -214 -222 -330 21 -83 76 -225 110 -284 100 -170 378 -370 654 -471 376 -137 757 -167 1102 -86 278 66 504 187 689 370 108 107 176 197 239 316 25 49 51 91 56 95 16 9 31 -9 94 -111 196 -316 448 -516 810 -641 336 -117 740 -122 1125 -14 297 84 533 213 711 389 164 163 221 269 222 416 1 66 -4 90 -25 135 -78 168 -302 263 -558 237 -41 -4 -156 -30 -255 -57 -223 -62 -311 -79 -446 -87 -183 -10 -352 31 -554 135 l-98 50 -22 -24 c-40 -44 -49 -77 -30 -117 29 -63 136 -154 230 -198 114 -54 192 -70 367 -76 177 -7 282 9 503 72 280 81 392 93 508 54 106 -35 157 -84 157 -151 0 -51 -59 -145 -134 -215 -226 -211 -559 -347 -961 -393 -216 -24 -499 5 -699 72 -314 105 -535 288 -671 556 -42 84 -31 81 -206 56 -100 -14 -118 -14 -186 0 -41 9 -79 16 -84 16 -5 0 -22 -30 -39 -66 -112 -249 -373 -466 -681 -568 -355 -118 -819 -76 -1207 109 -284 136 -425 272 -474 458 -11 41 -10 52 3 75 33 60 129 94 259 95 83 0 151 -15 325 -68 353 -109 499 -125 706 -75 157 38 305 134 365 236 23 39 24 48 14 78 -13 41 -47 86 -63 86 -7 0 -50 -25 -96 -56z"
id="path162" />
</g>
</svg>

After

Width:  |  Height:  |  Size: 5.8 KiB

View File

@ -0,0 +1,137 @@
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20010904//EN"
"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<svg version="1.0" xmlns="http://www.w3.org/2000/svg"
width="1024.000000pt" height="1024.000000pt" viewBox="0 0 1024.000000 1024.000000"
preserveAspectRatio="xMidYMid meet">
<g transform="translate(0.000000,1024.000000) scale(0.100000,-0.100000)"
fill="#000000" stroke="none">
<path d="M1884 10147 c-438 -456 -723 -1077 -825 -1797 -30 -207 -33 -589 -6
-688 26 -97 31 -92 -81 -91 -349 0 -651 -131 -891 -388 l-81 -86 0 -313 c0
-173 2 -314 4 -314 2 0 15 17 27 38 105 172 237 352 259 352 4 0 -3 -39 -16
-87 -179 -642 -244 -1229 -215 -1938 11 -258 41 -647 62 -785 5 -36 14 -99 19
-140 16 -113 67 -403 106 -600 8 -41 26 -119 40 -172 14 -53 24 -105 22 -115
-3 -19 -114 198 -212 417 -31 69 -66 139 -76 155 l-20 30 0 -296 0 -295 68
-105 c113 -172 229 -298 351 -380 64 -44 227 -124 300 -149 36 -12 114 -33
175 -47 130 -29 314 -35 462 -14 50 8 97 12 104 9 19 -6 -282 -123 -407 -158
-163 -46 -305 -64 -458 -57 -140 7 -176 14 -359 72 -92 29 -127 36 -158 31
-76 -12 -78 -15 -78 -124 l0 -97 43 -26 c76 -48 103 -58 218 -83 203 -44 260
-51 424 -50 322 2 609 85 1027 296 148 74 162 79 250 89 98 10 132 24 358 144
81 43 150 59 150 35 0 -20 -48 -105 -96 -170 -132 -181 -370 -374 -601 -489
-241 -120 -476 -181 -804 -210 -128 -11 -181 -34 -255 -113 -74 -78 -97 -144
-95 -278 0 -60 7 -130 16 -163 55 -211 304 -437 615 -560 127 -50 353 -97 470
-97 57 0 222 23 252 35 13 5 23 -7 47 -57 67 -141 187 -236 371 -292 l85 -27
450 3 450 3 118 38 c222 71 402 159 557 273 104 76 282 259 339 349 28 44 55
85 60 91 11 14 395 24 886 24 l361 0 60 -88 c225 -333 663 -595 1091 -652 116
-16 379 -7 482 15 184 41 364 115 512 211 94 61 233 199 289 286 l43 67 56
-28 c299 -150 843 -78 1089 144 63 57 112 139 112 184 -1 35 -32 101 -61 128
-13 12 -105 64 -205 116 -206 107 -221 117 -286 170 -72 58 -106 112 -234 367
-170 340 -232 438 -363 573 -36 37 -63 67 -60 67 4 0 27 -9 53 -20 25 -12 139
-43 253 -70 115 -27 246 -58 293 -69 47 -11 122 -27 167 -35 65 -11 117 -32
250 -96 92 -45 192 -94 220 -108 29 -14 59 -33 68 -42 14 -15 8 -16 -68 -17
-45 -1 -134 -2 -197 -3 -127 -3 -283 23 -366 59 -26 12 -52 21 -58 21 -19 0
-130 77 -195 135 -67 60 -81 66 -81 34 0 -29 104 -129 182 -174 69 -41 195
-85 303 -106 80 -16 480 -20 530 -5 l30 9 -30 14 -30 14 30 -6 c99 -21 159
-17 298 19 149 38 249 77 321 122 l45 29 1 182 0 183 -113 -98 c-63 -53 -137
-110 -166 -125 -133 -73 -289 -87 -426 -39 -90 31 -199 96 -192 115 2 7 35 15
80 19 139 13 270 39 395 80 127 41 299 117 365 162 21 14 42 26 47 26 6 0 10
86 10 218 0 215 0 217 -19 187 -17 -28 -97 -107 -166 -165 -105 -87 -270 -174
-410 -216 -108 -32 -214 -56 -228 -51 -5 1 2 16 15 32 96 118 238 513 307 855
89 444 125 846 126 1405 0 395 -6 505 -51 866 -42 343 -63 459 -130 723 -63
243 -167 480 -309 701 -98 152 -120 194 -107 207 17 17 176 22 259 9 101 -16
160 -33 247 -73 189 -86 279 -179 415 -433 18 -33 37 -64 42 -70 5 -5 9 89 9
240 l-1 250 -77 69 c-151 134 -339 226 -560 271 -62 12 -115 25 -118 28 -4 3
2 30 12 59 16 46 19 83 18 278 -1 231 -12 334 -62 575 -71 338 -235 765 -393
1022 -23 38 -54 89 -70 115 -97 163 -237 350 -377 506 l-74 82 -177 0 -176 0
91 -92 c204 -208 333 -376 488 -631 186 -307 325 -643 405 -977 69 -291 80
-381 56 -468 -10 -34 -23 -67 -31 -72 -9 -8 -80 -9 -234 -4 -266 8 -247 -1
-340 171 -252 464 -543 835 -890 1133 -562 482 -1190 792 -1850 912 -121 22
-145 23 -690 23 l-565 0 -120 -27 c-66 -15 -163 -36 -215 -47 -604 -133 -1184
-448 -1625 -885 -276 -274 -477 -557 -660 -931 -74 -153 -135 -251 -208 -340
-23 -28 -30 -30 -94 -31 -37 -1 -122 -5 -188 -9 -153 -10 -177 -1 -200 77 -45
151 0 415 140 835 162 483 398 878 756 1266 l91 97 -165 0 -164 0 -89 -93z
m3676 -162 c570 -67 1152 -279 1613 -586 465 -311 866 -729 1110 -1159 78
-137 182 -371 217 -489 25 -82 48 -224 42 -257 -5 -30 -30 -40 -162 -68 -184
-38 -221 -56 -315 -150 -73 -72 -89 -95 -143 -206 -87 -177 -126 -307 -178
-598 -9 -50 -20 -95 -25 -98 -6 -3 -21 1 -34 10 -32 21 -288 130 -405 173
-311 113 -759 239 -1045 293 -437 83 -625 101 -1115 107 -430 5 -548 0 -846
-38 -568 -71 -1208 -279 -1583 -514 -57 -36 -107 -65 -111 -65 -4 0 -11 17
-15 38 -24 135 -126 458 -183 583 -83 184 -243 382 -388 480 -39 26 -74 54
-77 61 -5 13 19 80 70 203 14 33 50 121 80 195 284 706 734 1232 1389 1623
420 251 966 428 1449 470 122 11 541 6 655 -8z m-4078 -2636 c256 -52 479
-221 607 -461 44 -81 99 -235 124 -343 57 -249 114 -543 147 -770 63 -417 58
-1157 -11 -1695 -14 -112 -67 -419 -84 -489 -32 -134 -66 -252 -95 -331 -17
-47 -39 -107 -49 -135 -26 -71 -120 -260 -164 -327 -20 -31 -53 -69 -72 -83
-75 -58 -219 -101 -355 -105 -85 -3 -82 -7 -46 68 36 74 50 105 90 202 15 36
32 74 37 85 25 55 94 258 125 367 42 150 75 287 93 388 34 184 59 412 76 690
22 348 21 450 -16 930 -25 333 -111 861 -198 1215 -25 103 -101 327 -131 385
-34 67 -108 179 -152 229 -76 88 -82 141 -17 173 18 10 35 18 36 18 1 0 26 -5
55 -11z m-491 -123 c299 -144 526 -765 614 -1681 45 -472 38 -1171 -16 -1595
-66 -520 -179 -884 -368 -1194 -57 -94 -141 -196 -161 -196 -43 0 -215 142
-313 258 -103 123 -219 414 -277 698 -129 623 -153 1928 -49 2609 75 495 180
842 308 1021 63 87 93 106 166 102 33 -2 76 -12 96 -22z m7908 -18 c25 -12 82
-58 126 -102 194 -195 347 -570 454 -1111 58 -292 85 -524 106 -903 40 -715
-2 -1279 -136 -1817 -78 -314 -216 -629 -307 -705 -57 -47 -150 -90 -195 -90
-53 0 -122 36 -164 85 -46 53 -240 432 -305 595 -116 290 -148 590 -163 1520
-18 1111 53 1735 250 2179 48 108 141 269 188 326 45 53 74 58 146 23z m-605
-168 c30 -11 33 -32 10 -86 -143 -338 -218 -739 -254 -1349 -19 -324 -9 -1015
20 -1465 13 -193 15 -279 6 -287 -21 -22 -82 112 -121 266 -115 457 -123 1127
-20 1782 15 94 25 199 25 262 0 113 13 177 85 432 56 202 96 305 152 398 36
59 49 66 97 47z m-2813 -365 c222 -23 294 -32 429 -55 656 -110 1118 -249
1549 -464 153 -77 210 -111 218 -133 3 -9 -3 -113 -15 -232 -12 -119 -27 -290
-34 -381 -17 -228 -17 -856 0 -990 20 -163 59 -374 92 -500 17 -63 37 -167 45
-230 8 -63 22 -138 30 -167 31 -106 152 -227 270 -272 33 -13 67 -26 75 -30
16 -8 294 -548 306 -595 l7 -29 -79 6 c-48 3 -133 22 -219 47 -613 182 -800
219 -1155 227 -242 6 -372 -6 -485 -46 -111 -39 -143 -80 -152 -195 -10 -127
32 -283 149 -561 189 -449 438 -737 743 -860 151 -61 228 -76 395 -79 179 -3
228 -15 270 -63 38 -43 40 -102 7 -229 -29 -107 -67 -185 -126 -256 -136 -162
-311 -251 -546 -278 -441 -50 -865 95 -1184 403 -150 145 -244 310 -401 707
-159 398 -271 599 -425 755 -82 84 -114 109 -179 141 -70 34 -90 39 -164 43
-100 4 -167 -12 -189 -46 -30 -45 -66 -169 -198 -673 -83 -315 -163 -522 -276
-710 -241 -406 -581 -631 -1089 -722 -46 -8 -148 -13 -265 -12 -160 0 -204 3
-277 22 -197 49 -294 130 -347 289 -46 136 -15 274 74 336 22 15 111 52 197
82 286 99 398 165 573 339 147 147 244 288 322 466 111 254 162 475 142 616
-13 96 -23 121 -62 162 -66 69 -159 26 -190 -89 -8 -30 -29 -130 -47 -224 -40
-205 -75 -318 -142 -449 -108 -210 -243 -352 -453 -478 -109 -65 -161 -87
-309 -128 -161 -44 -210 -68 -290 -141 -95 -86 -150 -235 -127 -339 16 -70 8
-74 -132 -66 -302 17 -623 128 -771 267 -177 165 -178 393 -3 428 29 6 111 20
182 31 260 41 443 97 651 199 227 111 379 220 588 421 149 144 363 413 430
539 41 79 86 211 86 255 0 41 -32 91 -71 112 -53 27 -110 15 -265 -59 -76 -36
-209 -93 -294 -127 -85 -33 -166 -65 -179 -71 -13 -5 -26 -7 -29 -5 -5 6 30
77 170 346 79 153 111 224 183 410 90 232 107 267 168 347 68 91 71 113 30
230 -32 93 -33 127 -19 818 9 461 4 630 -30 1060 -18 234 -19 249 -10 263 18
29 135 99 296 178 532 261 1153 421 1760 454 204 11 611 3 791 -15z m1629
-4090 c183 -13 507 -96 635 -163 174 -92 443 -407 684 -802 136 -223 184 -291
265 -378 71 -75 128 -115 188 -133 18 -5 85 -14 148 -19 160 -14 170 -17 185
-53 16 -39 6 -69 -38 -116 -104 -109 -427 -145 -776 -86 -88 15 -161 50 -185
88 -8 12 -16 60 -19 107 -11 166 -52 230 -192 295 -158 73 -181 104 -250 340
-14 50 -46 153 -70 230 -24 77 -49 168 -55 202 -29 147 -158 205 -294 133 -57
-29 -122 -77 -183 -134 -28 -26 -33 -37 -33 -77 0 -55 19 -104 57 -149 36 -43
79 -60 151 -60 47 0 73 7 140 40 l82 39 14 -22 c37 -56 134 -421 122 -453 -15
-38 -228 -2 -340 57 -126 65 -291 222 -400 379 -123 175 -246 488 -246 625 0
73 6 83 54 96 42 12 214 28 251 23 11 -2 58 -5 105 -9z m-2122 -601 c116 -58
283 -291 390 -544 53 -126 152 -388 152 -403 0 -8 -10 -17 -22 -21 -30 -8
-772 -8 -820 0 -20 4 -39 12 -42 19 -2 7 2 44 9 81 7 38 27 139 44 224 50 258
131 551 178 643 17 34 43 35 111 1z"/>
<path d="M3903 5575 c-345 -79 -564 -312 -613 -650 -35 -245 31 -497 178 -687
151 -195 345 -300 597 -324 145 -13 345 39 495 129 112 67 260 215 313 312
134 245 140 552 14 785 -109 202 -291 344 -532 415 -75 22 -114 27 -240 31
-106 2 -168 -1 -212 -11z m8 -326 c127 -55 196 -232 138 -359 -58 -129 -200
-188 -345 -145 -147 43 -214 213 -146 368 26 58 109 142 153 156 48 15 141 5
200 -20z"/>
<path d="M3696 5084 c-11 -30 -6 -53 20 -83 29 -35 59 -39 92 -12 30 26 30 74
0 95 -30 21 -104 21 -112 0z"/>
<path d="M6154 5579 c-119 -20 -277 -91 -375 -169 -124 -98 -216 -227 -268
-375 -36 -104 -42 -313 -12 -440 66 -280 279 -530 531 -623 261 -96 558 -54
781 112 187 139 306 323 354 549 26 121 12 297 -34 418 -117 309 -471 542
-820 538 -58 -1 -129 -5 -157 -10z m60 -313 c87 -51 140 -150 140 -261 -1
-111 -53 -188 -151 -225 -180 -67 -352 50 -353 240 0 117 53 214 142 259 60
31 157 25 222 -13z"/>
<path d="M6016 5065 c-9 -9 -16 -23 -16 -32 0 -25 30 -81 47 -87 20 -8 56 23
48 41 -2 8 0 11 5 8 17 -11 11 32 -7 53 -25 29 -58 36 -77 17z m50 -53 c1 -7
0 -8 -3 -2 -2 5 -9 8 -14 4 -5 -3 -9 0 -9 6 0 15 23 7 26 -8z"/>
<path d="M10020 505 c24 -24 46 -42 49 -40 6 7 -70 85 -83 85 -6 0 10 -20 34
-45z"/>
</g>
</svg>

After

Width:  |  Height:  |  Size: 9.3 KiB

View File

@ -0,0 +1,66 @@
DROP TABLE IF EXISTS artist;
CREATE TABLE artist (
id TEXT PRIMARY KEY NOT NULL,
name TEXT
);
DROP TABLE IF EXISTS artist_release_group;
CREATE TABLE artist_release_group (
artist_id TEXT NOT NULL,
release_group_id TEXT NOT NULL
);
DROP TABLE IF EXISTS artist_track;
CREATE TABLE artist_track (
artist_id TEXT NOT NULL,
track_id TEXT NOT NULL
);
DROP TABLE IF EXISTS release_group;
CREATE TABLE release_group (
id TEXT PRIMARY KEY NOT NULL,
albumartist TEXT,
albumsort INT,
musicbrainz_albumtype TEXT,
compilation TEXT,
album_artist_id TEXT
);
DROP TABLE IF EXISTS release_;
CREATE TABLE release_ (
id TEXT PRIMARY KEY NOT NULL,
release_group_id TEXT NOT NULL,
title TEXT,
copyright TEXT,
album_status TEXT,
language TEXT,
year TEXT,
date TEXT,
country TEXT,
barcode TEXT
);
DROP TABLE IF EXISTS track;
CREATE TABLE track (
id TEXT PRIMARY KEY NOT NULL,
downloaded BOOLEAN NOT NULL DEFAULT 0,
release_id TEXT NOT NULL,
track TEXT,
length INT,
tracknumber TEXT,
isrc TEXT,
genre TEXT,
lyrics TEXT,
path TEXT,
file TEXT,
url TEXT,
src TEXT
);
DROP TABLE IF EXISTS source;
CREATE TABLE source (
track_id TEXT NOT NULL,
src TEXT NOT NULL,
url TEXT NOT NULL,
valid BOOLEAN NOT NULL DEFAULT 1
);

View File

@ -1,13 +1,14 @@
import logging
import music_kraken
import logging
print("Setting logging-level to DEBUG")
logging.getLogger().setLevel(logging.DEBUG)
if __name__ == "__main__":
commands = [
"s: #a Ghost Bath",
"0",
"d: 1",
]

View File

@ -2,24 +2,30 @@ import music_kraken
from music_kraken.objects import Song, Album, Artist, Collection
if __name__ == "__main__":
song_1 = Song(
title="song",
feature_artist_list=[Artist(
name="main_artist"
)]
album_1 = Album(
title="album",
song_list=[
Song(title="song", main_artist_list=[Artist(name="artist")]),
],
artist_list=[
Artist(name="artist 3"),
]
)
other_artist = Artist(name="other_artist")
song_2 = Song(
title = "song",
artist_list=[other_artist]
album_2 = Album(
title="album",
song_list=[
Song(title="song", main_artist_list=[Artist(name="artist 2")]),
],
artist_list=[
Artist(name="artist"),
]
)
other_artist.name = "main_artist"
album_1.merge(album_2)
song_1.merge(song_2)
print()
print(*(f"{a.title_string} ; {a.id}" for a in album_1.artist_collection.data), sep=" | ")
print("#" * 120)
print("main", *song_1.artist_collection)
print("feat", *song_1.feature_artist_collection)
print(id(album_1.artist_collection), id(album_2.artist_collection))
print(id(album_1.song_collection[0].main_artist_collection), id(album_2.song_collection[0].main_artist_collection))

View File

@ -10,12 +10,12 @@ from ..objects import Target
LOGGER = logging_settings["codex_logger"]
def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], audio_format: str = main_settings["audio_format"], skip_intervals: List[Tuple[float, float]] = None):
def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], audio_format: str = main_settings["audio_format"], interval_list: List[Tuple[float, float]] = None):
if not target.exists:
LOGGER.warning(f"Target doesn't exist: {target.file_path}")
return
skip_intervals = skip_intervals or []
interval_list = interval_list or []
bitrate_b = int(bitrate_kb / 1024)
@ -29,7 +29,7 @@ def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], au
start = 0
next_start = 0
for end, next_start in skip_intervals:
for end, next_start in interval_list:
aselect_list.append(f"between(t,{start},{end})")
start = next_start
aselect_list.append(f"gte(t,{next_start})")
@ -47,7 +47,7 @@ def correct_codec(target: Target, bitrate_kb: int = main_settings["bitrate"], au
# run the ffmpeg command with a progressbar
ff = FfmpegProgress(ffmpeg_command)
with tqdm(total=100, desc=f"processing") as pbar:
with tqdm(total=100, desc=f"removing {len(interval_list)} segments") as pbar:
for progress in ff.run_command_with_progress():
pbar.update(progress-pbar.n)

View File

@ -1,21 +1,20 @@
import logging
import mutagen
from mutagen.id3 import ID3, Frame, APIC
from pathlib import Path
from typing import List
import mutagen
from mutagen.id3 import APIC, ID3, USLT, Frame
import logging
from PIL import Image
from ..connection import Connection
from ..objects import Metadata, Song, Target
from ..objects.metadata import Mapping
from ..utils.config import logging_settings, main_settings
from ..objects import Song, Target, Metadata
from ..connection import Connection
LOGGER = logging_settings["tagging_logger"]
artwork_connection: Connection = Connection()
class AudioMetadata:
def __init__(self, file_location: str = None) -> None:
self._file_location = None
@ -67,21 +66,18 @@ def write_metadata_to_target(metadata: Metadata, target: Target, song: Song):
id3_object = AudioMetadata(file_location=target.file_path)
LOGGER.info(str(metadata))
## REWRITE COMPLETLY !!!!!!!!!!!!
if len(song.artwork._data) != 0:
variants = song.artwork._data.__getitem__(0)
best_variant = variants.variants.__getitem__(0)
if song.artwork.best_variant is not None:
r = artwork_connection.get(
url=best_variant.url,
name=best_variant.url,
url=song.artwork.best_variant["url"],
disable_cache=False,
)
temp_target: Target = Target.temp()
with temp_target.open("wb") as f:
f.write(r.content)
converted_target: Target = Target.temp(name=f"{song.title.replace('/', '_')}")
converted_target: Target = Target.temp(name=f"{song.title}.jpeg")
with Image.open(temp_target.file_path) as img:
# crop the image if it isn't square in the middle with minimum data loss
width, height = img.size
@ -94,10 +90,6 @@ def write_metadata_to_target(metadata: Metadata, target: Target, song: Song):
# resize the image to the preferred resolution
img.thumbnail((main_settings["preferred_artwork_resolution"], main_settings["preferred_artwork_resolution"]))
# https://stackoverflow.com/a/59476938/16804841
if img.mode != 'RGB':
img = img.convert('RGB')
img.save(converted_target.file_path, "JPEG")
# https://stackoverflow.com/questions/70228440/mutagen-how-can-i-correctly-embed-album-art-into-mp3-file-so-that-i-can-see-t
@ -108,14 +100,11 @@ def write_metadata_to_target(metadata: Metadata, target: Target, song: Song):
mime="image/jpeg",
type=3,
desc=u"Cover",
data=converted_target.raw_content,
data=converted_target.read_bytes(),
)
)
id3_object.frames.delall("USLT")
uslt_val = metadata.get_id3_value(Mapping.UNSYNCED_LYRICS)
id3_object.frames.add(
USLT(encoding=3, lang=u'eng', desc=u'desc', text=uslt_val)
)
mutagen_file = mutagen.File(target.file_path)
id3_object.add_metadata(metadata)
id3_object.save()

View File

@ -6,18 +6,14 @@ import re
from .utils import cli_function
from .options.first_config import initial_config
from ..utils import output, BColors
from ..utils.config import write_config, main_settings
from ..utils.shared import URL_PATTERN
from ..utils.string_processing import fit_to_file_system
from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
from ..utils.exception import MKInvalidInputException
from ..utils.exception.download import UrlNotFoundException
from ..utils.enums.colors import BColors
from .. import console
from ..download.results import Results, Option, PageResults, GoToResults
from ..download.results import Results, Option, PageResults
from ..download.page_attributes import Pages
from ..pages import Page
from ..objects import Song, Album, Artist, DatabaseObject
@ -166,9 +162,9 @@ class Downloader:
self.genre = genre or get_genre()
self.process_metadata_anyway = process_metadata_anyway
output()
output(f"Downloading to: \"{self.genre}\"", color=BColors.HEADER)
output()
print()
print(f"Downloading to: \"{self.genre}\"")
print()
def print_current_options(self):
self.page_dict = dict()
@ -176,14 +172,14 @@ class Downloader:
print()
page_count = 0
for option in self.current_results.formatted_generator():
for option in self.current_results.formated_generator(max_items_per_page=self.max_displayed_options):
if isinstance(option, Option):
r = f"{BColors.GREY.value}{option.index:0{self.option_digits}}{BColors.ENDC.value} {option.music_object.option_string}"
print(r)
color = BColors.BOLD.value if self.pages.is_downloadable(option.music_object) else BColors.GREY.value
print(f"{color}{option.index:0{self.option_digits}} {option.music_object.option_string}{BColors.ENDC.value}")
else:
prefix = ALPHABET[page_count % len(ALPHABET)]
print(
f"{BColors.HEADER.value}({prefix}) --------------------------------{option.__name__:{PAGE_NAME_FILL}<{MAX_PAGE_LEN}}--------------------{BColors.ENDC.value}")
f"{BColors.HEADER.value}({prefix}) ------------------------{option.__name__:{PAGE_NAME_FILL}<{MAX_PAGE_LEN}}------------{BColors.ENDC.value}")
self.page_dict[prefix] = option
self.page_dict[option.__name__] = option
@ -215,9 +211,6 @@ class Downloader:
return True
def _process_parsed(self, key_text: Dict[str, str], query: str) -> Query:
# strip all the values in key_text
key_text = {key: value.strip() for key, value in key_text.items()}
song = None if not "t" in key_text else Song(title=key_text["t"], dynamic=True)
album = None if not "r" in key_text else Album(title=key_text["r"], dynamic=True)
artist = None if not "a" in key_text else Artist(name=key_text["a"], dynamic=True)
@ -226,7 +219,7 @@ class Downloader:
if album is not None:
song.album_collection.append(album)
if artist is not None:
song.artist_collection.append(artist)
song.main_artist_collection.append(artist)
return Query(raw_query=query, music_object=song)
if album is not None:
@ -249,7 +242,7 @@ class Downloader:
f"Recommendations and suggestions on sites to implement appreciated.\n"
f"But don't be a bitch if I don't end up implementing it.")
return
self.set_current_options(PageResults(page, data_object.options, max_items_per_page=self.max_displayed_options))
self.set_current_options(PageResults(page, data_object.options))
self.print_current_options()
return
@ -299,39 +292,66 @@ class Downloader:
self.set_current_options(self.pages.search(parsed_query))
self.print_current_options()
def goto(self, data_object: DatabaseObject):
def goto(self, index: int):
page: Type[Page]
music_object: DatabaseObject
self.pages.fetch_details(data_object, stop_at_level=1)
try:
page, music_object = self.current_results.get_music_object_by_index(index)
except KeyError:
print()
print(f"The option {index} doesn't exist.")
print()
return
self.set_current_options(GoToResults(data_object.options, max_items_per_page=self.max_displayed_options))
self.pages.fetch_details(music_object)
print(music_object)
print(music_object.options)
self.set_current_options(PageResults(page, music_object.options))
self.print_current_options()
def download(self, data_objects: List[DatabaseObject], **kwargs) -> bool:
output()
if len(data_objects) > 1:
output(f"Downloading {len(data_objects)} objects...", *("- " + o.option_string for o in data_objects), color=BColors.BOLD, sep="\n")
def download(self, download_str: str, download_all: bool = False) -> bool:
to_download: List[DatabaseObject] = []
if re.match(URL_PATTERN, download_str) is not None:
_, music_objects = self.pages.fetch_url(download_str)
to_download.append(music_objects)
else:
index: str
for index in download_str.split(", "):
if not index.strip().isdigit():
print()
print(f"Every download thingie has to be an index, not {index}.")
print()
return False
for index in download_str.split(", "):
to_download.append(self.current_results.get_music_object_by_index(int(index))[1])
print()
print("Downloading:")
for download_object in to_download:
print(download_object.option_string)
print()
_result_map: Dict[DatabaseObject, DownloadResult] = dict()
for database_object in data_objects:
r = self.pages.download(
data_object=database_object,
genre=self.genre,
**kwargs
)
for database_object in to_download:
r = self.pages.download(music_object=database_object, genre=self.genre, download_all=download_all,
process_metadata_anyway=self.process_metadata_anyway)
_result_map[database_object] = r
for music_object, result in _result_map.items():
output()
output(music_object.option_string)
output(result)
print()
print(music_object.option_string)
print(result)
return True
def process_input(self, input_str: str) -> bool:
try:
input_str = input_str.strip()
processed_input: str = input_str.lower()
@ -347,80 +367,20 @@ class Downloader:
self.print_current_options()
return False
command = ""
query = processed_input
if ":" in processed_input:
_ = processed_input.split(":")
command, query = _[0], ":".join(_[1:])
do_search = "s" in command
do_fetch = "f" in command
do_download = "d" in command
do_merge = "m" in command
if do_search and (do_download or do_fetch or do_merge):
raise MKInvalidInputException(message="You can't search and do another operation at the same time.")
if do_search:
self.search(":".join(input_str.split(":")[1:]))
if processed_input.startswith("s: "):
self.search(input_str[3:])
return False
def get_selected_objects(q: str):
if q.strip().lower() == "all":
return list(self.current_results)
if processed_input.startswith("d: "):
return self.download(input_str[3:])
indices = []
for possible_index in q.split(","):
possible_index = possible_index.strip()
if possible_index == "":
continue
i = 0
try:
i = int(possible_index)
except ValueError:
raise MKInvalidInputException(message=f"The index \"{possible_index}\" is not a number.")
if i < 0 or i >= len(self.current_results):
raise MKInvalidInputException(message=f"The index \"{i}\" is not within the bounds of 0-{len(self.current_results) - 1}.")
indices.append(i)
return [self.current_results[i] for i in indices]
selected_objects = get_selected_objects(query)
if do_merge:
old_selected_objects = selected_objects
a = old_selected_objects[0]
for b in old_selected_objects[1:]:
if type(a) != type(b):
raise MKInvalidInputException(message="You can't merge different types of objects.")
a.merge(b)
selected_objects = [a]
if do_fetch:
for data_object in selected_objects:
self.pages.fetch_details(data_object)
self.print_current_options()
if processed_input.isdigit():
self.goto(int(processed_input))
return False
if do_download:
self.download(selected_objects)
return False
if len(selected_objects) != 1:
raise MKInvalidInputException(message="You can only go to one object at a time without merging.")
self.goto(selected_objects[0])
return False
except MKInvalidInputException as e:
output("\n" + e.message + "\n", color=BColors.FAIL)
if processed_input != "help":
print(f"{BColors.WARNING.value}Invalid input.{BColors.ENDC.value}")
help_message()
return False
def mainloop(self):

View File

@ -1,14 +1,12 @@
import json
from pathlib import Path
from dataclasses import dataclass, field
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional
from functools import lru_cache
import logging
from ..utils import output, BColors
from ..utils.config import main_settings
from ..utils.string_processing import fit_to_file_system
@dataclass
@ -19,8 +17,6 @@ class CacheAttribute:
created: datetime
expires: datetime
additional_info: dict = field(default_factory=dict)
@property
def id(self):
return f"{self.module}_{self.name}"
@ -35,12 +31,6 @@ class CacheAttribute:
return self.__dict__ == other.__dict__
@dataclass
class CacheResult:
content: bytes
attribute: CacheAttribute
class Cache:
def __init__(self, module: str, logger: logging.Logger):
self.module = module
@ -58,7 +48,6 @@ class Cache:
self._time_fields = {"created", "expires"}
with self.index.open("r") as i:
try:
for c in json.loads(i.read()):
for key in self._time_fields:
c[key] = datetime.fromisoformat(c[key])
@ -66,8 +55,6 @@ class Cache:
ca = CacheAttribute(**c)
self.cached_attributes.append(ca)
self._id_to_attribute[ca.id] = ca
except json.JSONDecodeError:
pass
@lru_cache()
def _init_module(self, module: str) -> Path:
@ -76,7 +63,7 @@ class Cache:
:return: the module path
"""
r = Path(self._dir, module)
r.mkdir(exist_ok=True, parents=True)
r.mkdir(exist_ok=True)
return r
def _write_index(self, indent: int = 4):
@ -112,7 +99,7 @@ class Cache:
return True
def set(self, content: bytes, name: str, expires_in: float = 10, module: str = "", additional_info: dict = None):
def set(self, content: bytes, name: str, expires_in: float = 10, module: str = ""):
"""
:param content:
:param module:
@ -123,7 +110,6 @@ class Cache:
if name == "":
return
additional_info = additional_info or {}
module = self.module if module == "" else module
module_path = self._init_module(module)
@ -133,31 +119,27 @@ class Cache:
name=name,
created=datetime.now(),
expires=datetime.now() + timedelta(days=expires_in),
additional_info=additional_info,
)
self._write_attribute(cache_attribute)
cache_path = fit_to_file_system(Path(module_path, name.replace("/", "_")), hidden_ok=True)
cache_path = Path(module_path, name)
with cache_path.open("wb") as content_file:
self.logger.debug(f"writing cache to {cache_path}")
content_file.write(content)
def get(self, name: str) -> Optional[CacheResult]:
path = fit_to_file_system(Path(self._dir, self.module, name.replace("/", "_")), hidden_ok=True)
def get(self, name: str) -> Optional[bytes]:
path = Path(self._dir, self.module, name)
if not path.is_file():
return None
# check if it is outdated
if f"{self.module}_{name}" not in self._id_to_attribute:
path.unlink()
return
existing_attribute: CacheAttribute = self._id_to_attribute[f"{self.module}_{name}"]
if not existing_attribute.is_valid:
return
with path.open("rb") as f:
return CacheResult(content=f.read(), attribute=existing_attribute)
return f.read()
def clean(self):
keep = set()
@ -166,7 +148,7 @@ class Cache:
if ca.name == "":
continue
file = fit_to_file_system(Path(self._dir, ca.module, ca.name.replace("/", "_")), hidden_ok=True)
file = Path(self._dir, ca.module, ca.name)
if not ca.is_valid:
self.logger.debug(f"deleting cache {ca.id}")
@ -205,12 +187,9 @@ class Cache:
for path in self._dir.iterdir():
if path.is_dir():
for file in path.iterdir():
output(f"Deleting file {file}", color=BColors.GREY)
file.unlink()
output(f"Deleting folder {path}", color=BColors.HEADER)
path.rmdir()
else:
output(f"Deleting folder {path}", color=BColors.HEADER)
path.unlink()
self.cached_attributes.clear()

View File

@ -1,12 +1,12 @@
from __future__ import annotations
import copy
import inspect
import logging
import threading
import time
from typing import TYPE_CHECKING, Dict, List, Optional, Set
from urllib.parse import ParseResult, urlparse, urlunsplit
from typing import List, Dict, Optional, Set
from urllib.parse import urlparse, urlunsplit, ParseResult
import copy
import inspect
import requests
import responses
@ -14,15 +14,10 @@ from tqdm import tqdm
from .cache import Cache
from .rotating import RotatingProxy
if TYPE_CHECKING:
from ..objects import Target
from ..utils import request_trace
from ..objects import Target
from ..utils.config import main_settings
from ..utils.hacking import merge_args
from ..utils.string_processing import shorten_display_url
from ..utils.support_classes.download_result import DownloadResult
from ..utils.hacking import merge_args
class Connection:
@ -128,17 +123,12 @@ class Connection:
return headers
def save(self, r: requests.Response, name: str, error: bool = False, no_update_if_valid_exists: bool = False, **kwargs):
def save(self, r: requests.Response, name: str, error: bool = False, **kwargs):
n_kwargs = {}
if error:
n_kwargs["module"] = "failed_requests"
if self.cache.get(name) is not None and no_update_if_valid_exists:
return
self.cache.set(r.content, name, expires_in=kwargs.get("expires_in", self.cache_expiring_duration), additional_info={
"encoding": r.encoding,
}, **n_kwargs)
self.cache.set(r.content, name, expires_in=kwargs.get("expires_in", self.cache_expiring_duration), **n_kwargs)
def request(
self,
@ -153,7 +143,6 @@ class Connection:
sleep_after_404: float = None,
is_heartbeat: bool = False,
disable_cache: bool = None,
enable_cache_readonly: bool = False,
method: str = None,
name: str = "",
exclude_headers: List[str] = None,
@ -163,7 +152,7 @@ class Connection:
raise AttributeError("method is not set.")
method = method.upper()
headers = dict() if headers is None else headers
disable_cache = (headers.get("Cache-Control", "").lower() == "no-cache" if disable_cache is None else disable_cache) or kwargs.get("stream", False)
disable_cache = headers.get("Cache-Control", "").lower() == "no-cache" if disable_cache is None else disable_cache
accepted_response_codes = self.ACCEPTED_RESPONSE_CODES if accepted_response_codes is None else accepted_response_codes
current_kwargs = copy.copy(locals())
@ -171,7 +160,6 @@ class Connection:
current_kwargs.update(**kwargs)
parsed_url = urlparse(url)
trace_string = f"{method} {shorten_display_url(url)} \t{'[stream]' if kwargs.get('stream', False) else ''}"
if not raw_headers:
_headers = copy.copy(self.HEADER_VALUES)
@ -187,23 +175,15 @@ class Connection:
request_url = parsed_url.geturl() if not raw_url else url
if name != "" and (not disable_cache or enable_cache_readonly):
if name != "" and not disable_cache:
cached = self.cache.get(name)
if cached is not None:
request_trace(f"{trace_string}\t[cached]")
with responses.RequestsMock() as resp:
additional_info = cached.attribute.additional_info
body = cached.content
if additional_info.get("encoding", None) is not None:
body = body.decode(additional_info["encoding"])
resp.add(
method=method,
url=request_url,
body=body,
body=cached,
)
return requests.request(method=method, url=url, timeout=timeout, headers=headers, **kwargs)
@ -219,9 +199,6 @@ class Connection:
if header in headers:
del headers[header]
if try_count <= 0:
request_trace(trace_string)
r = None
connection_failed = False
try:
@ -231,7 +208,7 @@ class Connection:
pass
self.lock = True
r: requests.Response = self.session.request(method=method, url=url, timeout=timeout, headers=headers, **kwargs)
r: requests.Response = requests.request(method=method, url=url, timeout=timeout, headers=headers, **kwargs)
if r.status_code in accepted_response_codes:
if not disable_cache:
@ -251,10 +228,10 @@ class Connection:
self.lock = False
if r is None:
self.LOGGER.warning(f"{parsed_url.netloc} didn't respond at {url}. ({try_count}-{self.TRIES})")
self.LOGGER.warning(f"{self.HOST.netloc} didn't respond at {url}. ({try_count}-{self.TRIES})")
self.LOGGER.debug("request headers:\n\t"+ "\n\t".join(f"{k}\t=\t{v}" for k, v in headers.items()))
else:
self.LOGGER.warning(f"{parsed_url.netloc} responded wit {r.status_code} at {url}. ({try_count}-{self.TRIES})")
self.LOGGER.warning(f"{self.HOST.netloc} responded wit {r.status_code} at {url}. ({try_count}-{self.TRIES})")
self.LOGGER.debug("request headers:\n\t"+ "\n\t".join(f"{k}\t=\t{v}" for k, v in r.request.headers.items()))
self.LOGGER.debug("response headers:\n\t"+ "\n\t".join(f"{k}\t=\t{v}" for k, v in r.headers.items()))
self.LOGGER.debug(r.content)
@ -320,7 +297,7 @@ class Connection:
name = kwargs.pop("description")
if progress > 0:
headers = kwargs.get("headers", dict())
headers = dict() if headers is None else headers
headers["Range"] = f"bytes={target.size}-"
r = self.request(
@ -369,7 +346,6 @@ class Connection:
if retry:
self.LOGGER.warning(f"Retrying stream...")
accepted_response_codes.add(206)
stream_kwargs["progress"] = progress
return Connection.stream_into(**stream_kwargs)
return DownloadResult()

View File

@ -1,21 +0,0 @@
from dataclasses import dataclass, field
from typing import Set
from ..utils.config import main_settings
from ..utils.enums.album import AlbumType
@dataclass
class FetchOptions:
download_all: bool = False
album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
@dataclass
class DownloadOptions:
download_all: bool = False
album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
download_again_if_found: bool = False
process_audio_if_found: bool = False
process_metadata_if_found: bool = True

View File

@ -1,45 +1,20 @@
from typing import Tuple, Type, Dict, Set, Optional, List
from collections import defaultdict
from pathlib import Path
import re
import logging
import subprocess
from typing import Tuple, Type, Dict, Set
from PIL import Image
from . import FetchOptions, DownloadOptions
from .results import SearchResults
from ..objects import (
DatabaseObject as DataObject,
Collection,
Target,
Source,
Options,
Song,
Album,
Artist,
Label,
)
from ..objects.artwork import ArtworkVariant
from ..audio import write_metadata_to_target, correct_codec
from ..utils import output, BColors
from ..utils.string_processing import fit_to_file_system
from ..utils.config import youtube_settings, main_settings
from ..utils.path_manager import LOCATIONS
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..objects import DatabaseObject, Source
from ..utils.config import youtube_settings
from ..utils.enums.source import SourcePages
from ..utils.support_classes.download_result import DownloadResult
from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
from ..utils.exception import MKMissingNameException
from ..utils.exception.download import UrlNotFoundException
from ..utils.shared import DEBUG_PAGES
from ..connection import Connection
from ..pages import Page, EncyclopaediaMetallum, Musify, YouTube, YoutubeMusic, Bandcamp, Genius, INDEPENDENT_DB_OBJECTS
from ..pages import Page, EncyclopaediaMetallum, Musify, YouTube, YoutubeMusic, Bandcamp, INDEPENDENT_DB_OBJECTS
ALL_PAGES: Set[Type[Page]] = {
# EncyclopaediaMetallum,
Genius,
Musify,
YoutubeMusic,
Bandcamp
@ -59,13 +34,6 @@ SHADY_PAGES: Set[Type[Page]] = {
Musify,
}
fetch_map = {
Song: "fetch_song",
Album: "fetch_album",
Artist: "fetch_artist",
Label: "fetch_label",
}
if DEBUG_PAGES:
DEBUGGING_PAGE = Bandcamp
print(f"Only downloading from page {DEBUGGING_PAGE}.")
@ -75,15 +43,10 @@ if DEBUG_PAGES:
class Pages:
def __init__(self, exclude_pages: Set[Type[Page]] = None, exclude_shady: bool = False, download_options: DownloadOptions = None, fetch_options: FetchOptions = None):
self.LOGGER = logging.getLogger("download")
self.download_options: DownloadOptions = download_options or DownloadOptions()
self.fetch_options: FetchOptions = fetch_options or FetchOptions()
def __init__(self, exclude_pages: Set[Type[Page]] = None, exclude_shady: bool = False) -> None:
# initialize all page instances
self._page_instances: Dict[Type[Page], Page] = dict()
self._source_to_page: Dict[SourceType, Type[Page]] = dict()
self._source_to_page: Dict[SourcePages, Type[Page]] = dict()
exclude_pages = exclude_pages if exclude_pages is not None else set()
@ -91,8 +54,7 @@ class Pages:
exclude_pages = exclude_pages.union(SHADY_PAGES)
if not exclude_pages.issubset(ALL_PAGES):
raise ValueError(
f"The excluded pages have to be a subset of all pages: {exclude_pages} | {ALL_PAGES}")
raise ValueError(f"The excluded pages have to be a subset of all pages: {exclude_pages} | {ALL_PAGES}")
def _set_to_tuple(page_set: Set[Type[Page]]) -> Tuple[Type[Page], ...]:
return tuple(sorted(page_set, key=lambda page: page.__name__))
@ -100,283 +62,72 @@ class Pages:
self._pages_set: Set[Type[Page]] = ALL_PAGES.difference(exclude_pages)
self.pages: Tuple[Type[Page], ...] = _set_to_tuple(self._pages_set)
self._audio_pages_set: Set[Type[Page]
] = self._pages_set.intersection(AUDIO_PAGES)
self.audio_pages: Tuple[Type[Page], ...] = _set_to_tuple(
self._audio_pages_set)
self._audio_pages_set: Set[Type[Page]] = self._pages_set.intersection(AUDIO_PAGES)
self.audio_pages: Tuple[Type[Page], ...] = _set_to_tuple(self._audio_pages_set)
for page_type in self.pages:
self._page_instances[page_type] = page_type(
fetch_options=self.fetch_options, download_options=self.download_options)
self._page_instances[page_type] = page_type()
self._source_to_page[page_type.SOURCE_TYPE] = page_type
def _get_page_from_enum(self, source_page: SourceType) -> Page:
if source_page not in self._source_to_page:
return None
return self._page_instances[self._source_to_page[source_page]]
def search(self, query: Query) -> SearchResults:
result = SearchResults()
for page_type in self.pages:
result.add(
page=page_type,
search_result=self._page_instances[page_type].search(
query=query)
search_result=self._page_instances[page_type].search(query=query)
)
return result
def fetch_details(self, data_object: DataObject, stop_at_level: int = 1, **kwargs) -> DataObject:
if not isinstance(data_object, INDEPENDENT_DB_OBJECTS):
return data_object
def fetch_details(self, music_object: DatabaseObject, stop_at_level: int = 1) -> DatabaseObject:
if not isinstance(music_object, INDEPENDENT_DB_OBJECTS):
return music_object
source: Source
for source in data_object.source_collection.get_sources(source_type_sorting={
"only_with_page": True,
}):
new_data_object = self.fetch_from_source(
source=source, stop_at_level=stop_at_level)
if new_data_object is not None:
data_object.merge(new_data_object)
return data_object
def fetch_from_source(self, source: Source, **kwargs) -> Optional[DataObject]:
if not source.has_page:
return None
source_type = source.page.get_source_type(source=source)
if source_type is None:
self.LOGGER.debug(f"Could not determine source type for {source}.")
return None
func = getattr(source.page, fetch_map[source_type])
# fetching the data object and marking it as fetched
data_object: DataObject = func(source=source, **kwargs)
data_object.mark_as_fetched(source.hash_url)
return data_object
def fetch_from_url(self, url: str) -> Optional[DataObject]:
source = Source.match_url(url, ALL_SOURCE_TYPES.MANUAL)
if source is None:
return None
return self.fetch_from_source(source=source)
def _skip_object(self, data_object: DataObject) -> bool:
if isinstance(data_object, Album):
if not self.download_options.download_all and data_object.album_type in self.download_options.album_type_blacklist:
return True
return False
def _fetch_artist_artwork(self, artist: Artist, naming: dict):
naming: Dict[str, List[str]] = defaultdict(list, naming)
naming["artist"].append(artist.name)
naming["label"].extend(
[l.title_value for l in artist.label_collection])
# removing duplicates from the naming, and process the strings
for key, value in naming.items():
# https://stackoverflow.com/a/17016257
naming[key] = list(dict.fromkeys(value))
artwork_collection: ArtworkCollection = artist.artwork
artwork_collection.compile()
for image_number, artwork in enumerate(artwork_collection):
for artwork_variant in artwork.variants:
naming["image_number"] = [str(image_number)]
target = Target(
relative_to_music_dir=True,
file_path=Path(self._parse_path_template(
main_settings["artist_artwork_path"], naming=naming))
)
if not target.file_path.parent.exists():
target.create_path()
subprocess.Popen(["gio", "set", target.file_path.parent, "metadata::custom-icon", "file://"+str(target.file_path)])
with Image.open(artwork_variant.target.file_path) as img:
img.save(target.file_path, main_settings["image_format"])
artwork_variant.target = Target
def download(self, data_object: DataObject, genre: str, **kwargs) -> DownloadResult:
# fetch the given object
self.fetch_details(data_object)
output(
f"\nDownloading {data_object.option_string}...", color=BColors.BOLD)
# fetching all parent objects (e.g. if you only download a song)
if not kwargs.get("fetched_upwards", False):
to_fetch: List[DataObject] = [data_object]
while len(to_fetch) > 0:
new_to_fetch = []
for d in to_fetch:
if self._skip_object(d):
for source_page in music_object.source_collection.source_pages:
if source_page not in self._source_to_page:
continue
self.fetch_details(d)
page_type = self._source_to_page[source_page]
for c in d.get_parent_collections():
new_to_fetch.extend(c)
if page_type in self._pages_set:
music_object.merge(self._page_instances[page_type].fetch_details(music_object=music_object, stop_at_level=stop_at_level))
to_fetch = new_to_fetch
return music_object
kwargs["fetched_upwards"] = True
def is_downloadable(self, music_object: DatabaseObject) -> bool:
_page_types = set(self._source_to_page)
for src in music_object.source_collection.source_pages:
if src in self._source_to_page:
_page_types.add(self._source_to_page[src])
naming = kwargs.get("naming", {
"genre": [genre],
"audio_format": [main_settings["audio_format"]],
"image_format": [main_settings["image_format"]]
})
audio_pages = self._audio_pages_set.intersection(_page_types)
return len(audio_pages) > 0
# download artist artwork
if isinstance(data_object, Artist):
self._fetch_artist_artwork(artist=data_object, naming=naming)
def download(self, music_object: DatabaseObject, genre: str, download_all: bool = False, process_metadata_anyway: bool = False) -> DownloadResult:
if not isinstance(music_object, INDEPENDENT_DB_OBJECTS):
return DownloadResult(error_message=f"{type(music_object).__name__} can't be downloaded.")
# download all children
download_result: DownloadResult = DownloadResult()
for c in data_object.get_child_collections():
for d in c:
if self._skip_object(d):
continue
self.fetch_details(music_object)
download_result.merge(self.download(d, genre, **kwargs))
_page_types = set(self._source_to_page)
for src in music_object.source_collection.source_pages:
if src in self._source_to_page:
_page_types.add(self._source_to_page[src])
# actually download if the object is a song
if isinstance(data_object, Song):
"""
TODO
add the traced artist and album to the naming.
I am able to do that, because duplicate values are removed later on.
"""
audio_pages = self._audio_pages_set.intersection(_page_types)
self._download_song(data_object, naming=naming)
for download_page in audio_pages:
return self._page_instances[download_page].download(music_object=music_object, genre=genre, download_all=download_all, process_metadata_anyway=process_metadata_anyway)
return download_result
return DownloadResult(error_message=f"No audio source has been found for {music_object}.")
def _extract_fields_from_template(self, path_template: str) -> Set[str]:
return set(re.findall(r"{([^}]+)}", path_template))
def _parse_path_template(self, path_template: str, naming: Dict[str, List[str]]) -> str:
field_names: Set[str] = self._extract_fields_from_template(
path_template)
for field in field_names:
if len(naming[field]) == 0:
raise MKMissingNameException(f"Missing field for {field}.")
path_template = path_template.replace(
f"{{{field}}}", naming[field][0])
return path_template
def _download_song(self, song: Song, naming: dict) -> DownloadOptions:
"""
TODO
Search the song in the file system.
"""
r = DownloadResult(total=1)
# pre process the data recursively
song.compile()
# manage the naming
naming: Dict[str, List[str]] = defaultdict(list, naming)
naming["song"].append(song.title_value)
naming["isrc"].append(song.isrc)
naming["album"].extend(a.title_value for a in song.album_collection)
naming["album_type"].extend(
a.album_type.value for a in song.album_collection)
naming["artist"].extend(a.name for a in song.artist_collection)
naming["artist"].extend(a.name for a in song.feature_artist_collection)
for a in song.album_collection:
naming["label"].extend([l.title_value for l in a.label_collection])
# removing duplicates from the naming, and process the strings
for key, value in naming.items():
# https://stackoverflow.com/a/17016257
naming[key] = list(dict.fromkeys(value))
song.genre = naming["genre"][0]
# manage the targets
tmp: Target = Target.temp(file_extension=main_settings["audio_format"])
song.target_collection.append(Target(
relative_to_music_dir=True,
file_path=Path(
self._parse_path_template(
main_settings["download_path"], naming=naming),
self._parse_path_template(
main_settings["download_file"], naming=naming),
)
))
for target in song.target_collection:
if target.exists:
output(
f'{target.file_path} {BColors.OKGREEN.value}[already exists]', color=BColors.GREY)
r.found_on_disk += 1
if not self.download_options.download_again_if_found:
target.copy_content(tmp)
else:
target.create_path()
output(f'{target.file_path}', color=BColors.GREY)
# this streams from every available source until something succeeds, setting the skip intervals to the values of the according source
used_source: Optional[Source] = None
skip_intervals: List[Tuple[float, float]] = []
for source in song.source_collection.get_sources(source_type_sorting={
"only_with_page": True,
"sort_key": lambda page: page.download_priority,
"reverse": True,
}):
if tmp.exists:
break
used_source = source
streaming_results = source.page.download_song_to_target(
source=source, target=tmp, desc="download")
skip_intervals = source.page.get_skip_intervals(
song=song, source=source)
# if something has been downloaded but it somehow failed, delete the file
if streaming_results.is_fatal_error and tmp.exists:
tmp.delete()
# if everything went right, the file should exist now
if not tmp.exists:
if used_source is None:
r.error_message = f"No source found for {song.option_string}."
else:
r.error_message = f"Something went wrong downloading {song.option_string}."
return r
# post process the audio
found_on_disk = used_source is None
if not found_on_disk or self.download_options.process_audio_if_found:
correct_codec(target=tmp, skip_intervals=skip_intervals)
r.sponsor_segments = len(skip_intervals)
if used_source is not None:
used_source.page.post_process_hook(song=song, temp_target=tmp)
if not found_on_disk or self.download_options.process_metadata_if_found:
write_metadata_to_target(
metadata=song.metadata, target=tmp, song=song)
# copy the tmp target to the final locations
for target in song.target_collection:
tmp.copy_content(target)
tmp.delete()
return r
def fetch_url(self, url: str, stop_at_level: int = 2) -> Tuple[Type[Page], DataObject]:
source = Source.match_url(url, ALL_SOURCE_TYPES.MANUAL)
def fetch_url(self, url: str, stop_at_level: int = 2) -> Tuple[Type[Page], DatabaseObject]:
source = Source.match_url(url, SourcePages.MANUAL)
if source is None:
raise UrlNotFoundException(url=url)
_actual_page = self._source_to_page[source.source_type]
_actual_page = self._source_to_page[source.page_enum]
return _actual_page, self._page_instances[_actual_page].fetch_object_from_source(source=source, stop_at_level=stop_at_level)

View File

@ -2,6 +2,7 @@ from typing import Tuple, Type, Dict, List, Generator, Union
from dataclasses import dataclass
from ..objects import DatabaseObject
from ..utils.enums.source import SourcePages
from ..pages import Page, EncyclopaediaMetallum, Musify
@ -12,35 +13,31 @@ class Option:
class Results:
def __init__(self, max_items_per_page: int = 10, **kwargs) -> None:
def __init__(self) -> None:
self._by_index: Dict[int, DatabaseObject] = dict()
self._page_by_index: Dict[int: Type[Page]] = dict()
self.max_items_per_page = max_items_per_page
def __iter__(self) -> Generator[DatabaseObject, None, None]:
for option in self.formatted_generator():
for option in self.formated_generator():
if isinstance(option, Option):
yield option.music_object
def formatted_generator(self) -> Generator[Union[Type[Page], Option], None, None]:
def formated_generator(self, max_items_per_page: int = 10) -> Generator[Union[Type[Page], Option], None, None]:
self._by_index = dict()
self._page_by_index = dict()
def __len__(self) -> int:
return max(self._by_index.keys())
def __getitem__(self, index: int):
return self._by_index[index]
def get_music_object_by_index(self, index: int) -> Tuple[Type[Page], DatabaseObject]:
# if this throws a key error, either the formatted generator needs to be iterated, or the option doesn't exist.
return self._page_by_index[index], self._by_index[index]
class SearchResults(Results):
def __init__(
self,
pages: Tuple[Type[Page], ...] = None,
**kwargs,
pages: Tuple[Type[Page], ...] = None
) -> None:
super().__init__(**kwargs)
super().__init__()
self.pages = pages or []
# this would initialize a list for every page, which I don't think I want
@ -58,11 +55,8 @@ class SearchResults(Results):
def get_page_results(self, page: Type[Page]) -> "PageResults":
return PageResults(page, self.results.get(page, []))
def __len__(self) -> int:
return sum(min(self.max_items_per_page, len(results)) for results in self.results.values())
def formatted_generator(self):
super().formatted_generator()
def formated_generator(self, max_items_per_page: int = 10):
super().formated_generator()
i = 0
for page in self.results:
@ -76,37 +70,19 @@ class SearchResults(Results):
i += 1
j += 1
if j >= self.max_items_per_page:
if j >= max_items_per_page:
break
class GoToResults(Results):
def __init__(self, results: List[DatabaseObject], **kwargs):
self.results: List[DatabaseObject] = results
super().__init__(**kwargs)
def __getitem__(self, index: int):
return self.results[index]
def __len__(self) -> int:
return len(self.results)
def formatted_generator(self):
yield from (Option(i, o) for i, o in enumerate(self.results))
class PageResults(Results):
def __init__(self, page: Type[Page], results: List[DatabaseObject], **kwargs) -> None:
super().__init__(**kwargs)
def __init__(self, page: Type[Page], results: List[DatabaseObject]) -> None:
super().__init__()
self.page: Type[Page] = page
self.results: List[DatabaseObject] = results
def formatted_generator(self, max_items_per_page: int = 10):
super().formatted_generator()
def formated_generator(self, max_items_per_page: int = 10):
super().formated_generator()
i = 0
yield self.page
@ -116,6 +92,3 @@ class PageResults(Results):
self._by_index[i] = option
self._page_by_index[i] = self.page
i += 1
def __len__(self) -> int:
return len(self.results)

View File

@ -1,16 +1,27 @@
from typing_extensions import TypeVar
from .artwork import ArtworkCollection
from .collection import Collection
from .contact import Contact
from .country import Country
from .formatted_text import FormattedText
from .metadata import ID3Timestamp
from .metadata import Mapping as ID3Mapping
from .metadata import Metadata
from .option import Options
from .parents import OuterProxy
from .song import Album, Artist, Label, Lyrics, Song, Target
from .source import Source, SourceType
DatabaseObject = OuterProxy
from .metadata import Metadata, Mapping as ID3Mapping, ID3Timestamp
from .source import Source, SourcePages, SourceTypes
from .song import (
Song,
Album,
Artist,
Target,
Lyrics,
Label
)
from .formatted_text import FormattedText
from .collection import Collection
from .country import Country
from .contact import Contact
from .parents import OuterProxy
from .artwork import Artwork
DatabaseObject = TypeVar('T', bound=OuterProxy)

View File

@ -1,243 +1,59 @@
from __future__ import annotations
from copy import copy
from dataclasses import dataclass, field
from functools import cached_property
from typing import Dict, List, Optional, Set, Tuple, Type, TypedDict, Union
from typing import List, Optional, Dict, Tuple, Type, Union, TypedDict
from ..connection import Connection
from ..utils import create_dataclass_instance, custom_hash
from ..utils.config import main_settings
from ..utils.enums import PictureType
from ..utils.string_processing import hash_url, unify
from .collection import Collection
from .metadata import ID3Timestamp
from .metadata import Mapping as id3Mapping
from .metadata import Metadata
from .metadata import (
Mapping as id3Mapping,
ID3Timestamp,
Metadata
)
from ..utils.string_processing import unify, hash_url
from .parents import OuterProxy as Base
from .target import Target
from PIL import Image
import imagehash
artwork_connection: Connection = Connection(module="artwork")
from ..utils.config import main_settings
@dataclass
class ArtworkVariant:
class ArtworkVariant(TypedDict):
url: str
width: Optional[int] = None
heigth: Optional[int] = None
image_format: Optional[str] = None
width: int
height: int
deviation: float
def __hash__(self) -> int:
return custom_hash(self.url)
def __eq__(self, other: ArtworkVariant) -> bool:
return hash(self) == hash(other)
class Artwork:
def __init__(self, *variants: List[ArtworkVariant]) -> None:
self._variant_mapping: Dict[str, ArtworkVariant] = {}
def __contains__(self, other: str) -> bool:
return custom_hash(other) == hash(self.url)
for variant in variants:
self.append(**variant)
def __merge__(self, other: ArtworkVariant) -> None:
for key, value in other.__dict__.items():
if value is None:
continue
@staticmethod
def _calculate_deviation(*dimensions: List[int]) -> float:
return sum(abs(d - main_settings["preferred_artwork_resolution"]) for d in dimensions) / len(dimensions)
if getattr(self, key) is None:
setattr(self, key, value)
@cached_property
def target(self) -> Target:
return Target.temp()
def fetch(self) -> None:
global artwork_connection
r = artwork_connection.get(self.url, name=hash_url(self.url))
if r is None:
def append(self, url: str, width: int = main_settings["preferred_artwork_resolution"], height: int = main_settings["preferred_artwork_resolution"], **kwargs) -> None:
if url is None:
return
self.target.raw_content = r.content
@dataclass
class Artwork:
variants: List[ArtworkVariant] = field(default_factory=list)
artwork_type: PictureType = PictureType.OTHER
def search_variant(self, url: str) -> Optional[ArtworkVariant]:
if url is None:
return None
for variant in self.variants:
if url in variant:
return variant
return None
def __contains__(self, other: str) -> bool:
return self.search_variant(other) is not None
def add_data(self, **kwargs) -> None:
variant = self.search_variant(kwargs.get("url"))
if variant is None:
variant, kwargs = create_dataclass_instance(ArtworkVariant, kwargs)
self.variants.append(variant)
variant.__dict__.update(kwargs)
self._variant_mapping[hash_url(url=url)] = {
"url": url,
"width": width,
"height": height,
"deviation": self._calculate_deviation(width, height),
}
@property
def url(self) -> Optional[str]:
if len(self.variants) <= 0:
def best_variant(self) -> ArtworkVariant:
if len(self._variant_mapping.keys()) <= 0:
return None
return self.variants[0].url
def fetch(self) -> None:
for variant in self.variants:
variant.fetch()
class ArtworkCollection:
"""
Stores all the images/artworks for one data object.
There could be duplicates before calling ArtworkCollection.compile()
_this is called before one object is downloaded automatically._
"""
artwork_type: PictureType = PictureType.OTHER
def __init__(
self,
*data: List[Artwork],
parent_artworks: Set[ArtworkCollection] = None,
crop_images: bool = True,
) -> None:
# this is used for the song artwork, to fall back to the song artwork
self.parent_artworks: Set[ArtworkCollection] = parent_artworks or set()
self.crop_images: bool = crop_images
self._data = []
self.extend(data)
def search_artwork(self, url: str) -> Optional[ArtworkVariant]:
for artwork in self._data:
if url in artwork:
return artwork
return None
def __contains__(self, other: str) -> bool:
return self.search_artwork(other) is not None
def _create_new_artwork(self, **kwargs) -> Tuple[Artwork, dict]:
kwargs["artwork_type"] = kwargs.get("artwork_type", self.artwork_type)
return create_dataclass_instance(Artwork, dict(**kwargs))
def add_data(self, url: str, **kwargs) -> Artwork:
kwargs["url"] = url
artwork = self.search_artwork(url)
if artwork is None:
artwork, kwargs = self._create_new_artwork(**kwargs)
self._data.append(artwork)
artwork.add_data(**kwargs)
return artwork
def append(self, value: Union[Artwork, ArtworkVariant, dict], **kwargs):
"""
You can append the types Artwork, ArtworkVariant or dict
the best option would be to use Artwork and avoid the other options.
"""
if isinstance(value, dict):
kwargs.update(value)
value, kwargs = create_dataclass_instance(ArtworkVariant, kwargs)
if isinstance(value, ArtworkVariant):
kwargs["variants"] = [value]
value, kwargs = create_dataclass_instance(Artwork, kwargs)
if isinstance(value, Artwork):
self._data.append(value)
return
def extend(self, values: List[Union[Artwork, ArtworkVariant, dict]], **kwargs):
for value in values:
self.append(value, **kwargs)
def compile(self, **kwargs) -> None:
"""
This will make the artworks ready for download and delete duplicates.
"""
artwork_hashes: list = list()
artwork_urls: list = list()
for artwork in self._data:
index = 0
for artwork_variant in artwork.variants:
r = artwork_connection.get(
url=artwork_variant.url,
name=artwork_variant.url,
)
if artwork_variant.url in artwork_urls:
artwork.variants.pop(index)
continue
artwork_urls.append(artwork_variant.url)
target: Target = artwork_variant.target
with target.open("wb") as f:
f.write(r.content)
with Image.open(target.file_path) as img:
# https://stackoverflow.com/a/59476938/16804841
if img.mode != 'RGB':
img = img.convert('RGB')
try:
image_hash = imagehash.crop_resistant_hash(img)
except Exception as e:
continue
if image_hash in artwork_hashes:
artwork.variants.pop(index)
target.delete()
continue
artwork_hashes.append(image_hash)
width, height = img.size
if width != height:
if width > height:
img = img.crop((width // 2 - height // 2, 0, width // 2 + height // 2, height))
else:
img = img.crop((0, height // 2 - width // 2, width, height // 2 + width // 2))
# resize the image to the preferred resolution
img.thumbnail((main_settings["preferred_artwork_resolution"], main_settings["preferred_artwork_resolution"]))
index =+ 1
def __merge__(self, other: ArtworkCollection, **kwargs) -> None:
self.parent_artworks.update(other.parent_artworks)
for other_artwork in other._data:
for other_variant in other_artwork.variants:
if self.__contains__(other_variant.url):
continue
self.append(ArtworkVariant(other_variant.url))
def __hash__(self) -> int:
return id(self)
def __iter__(self) -> Generator[Artwork, None, None]:
yield from self._data
def get_urls(self) -> Generator[str, None, None]:
yield from (artwork.url for artwork in self._data if artwork.url is not None)
return min(self._variant_mapping.values(), key=lambda x: x["deviation"])
def __merge__(self, other: Artwork, override: bool = False) -> None:
for key, value in other._variant_mapping.items():
if key not in self._variant_mapping or override:
self._variant_mapping[key] = value
def __eq__(self, other: Artwork) -> bool:
return any(a == b for a, b in zip(self._variant_mapping.keys(), other._variant_mapping.keys()))

View File

@ -0,0 +1,110 @@
from collections import defaultdict
from typing import Dict, List, Optional
import weakref
from .parents import DatabaseObject
"""
This is a cache for the objects, that et pulled out of the database.
This is necessary, to not have duplicate objects with the same id.
Using a cache that maps the ojects to their id has multiple benefits:
- if you modify the object at any point, all objects with the same id get modified *(copy by reference)*
- less ram usage
- to further decrease ram usage I only store weak refs and not a strong reference, for the gc to still work
"""
class ObjectCache:
"""
ObjectCache is a cache for the objects retrieved from a database.
It maps each object to its id and uses weak references to manage its memory usage.
Using a cache for these objects provides several benefits:
- Modifying an object updates all objects with the same id (due to copy by reference)
- Reduced memory usage
:attr object_to_id: Dictionary that maps DatabaseObjects to their id.
:attr weakref_map: Dictionary that uses weak references to DatabaseObjects as keys and their id as values.
:method exists: Check if a DatabaseObject already exists in the cache.
:method append: Add a DatabaseObject to the cache if it does not already exist.
:method extent: Add a list of DatabaseObjects to the cache.
:method remove: Remove a DatabaseObject from the cache by its id.
:method get: Retrieve a DatabaseObject from the cache by its id. """
object_to_id: Dict[str, DatabaseObject]
weakref_map: Dict[weakref.ref, str]
def __init__(self) -> None:
self.object_to_id = dict()
self.weakref_map = defaultdict()
def exists(self, database_object: DatabaseObject) -> bool:
"""
Check if a DatabaseObject with the same id already exists in the cache.
:param database_object: The DatabaseObject to check for.
:return: True if the DatabaseObject exists, False otherwise.
"""
if database_object.dynamic:
return True
return database_object.id in self.object_to_id
def on_death(self, weakref_: weakref.ref) -> None:
"""
Callback function that gets triggered when the reference count of a DatabaseObject drops to 0.
This function removes the DatabaseObject from the cache.
:param weakref_: The weak reference of the DatabaseObject that has been garbage collected.
"""
data_id = self.weakref_map.pop(weakref_)
self.object_to_id.pop(data_id)
def get_weakref(self, database_object: DatabaseObject) -> weakref.ref:
return weakref.ref(database_object, self.on_death)
def append(self, database_object: DatabaseObject) -> bool:
"""
Add a DatabaseObject to the cache.
:param database_object: The DatabaseObject to add to the cache.
:return: True if the DatabaseObject already exists in the cache, False otherwise.
"""
if self.exists(database_object):
return True
self.weakref_map[weakref.ref(database_object, self.on_death)] = database_object.id
self.object_to_id[database_object.id] = database_object
return False
def extent(self, database_object_list: List[DatabaseObject]):
"""
adjacent to the extent method of list, this appends n Object
"""
for database_object in database_object_list:
self.append(database_object)
def remove(self, _id: str):
"""
Remove a DatabaseObject from the cache.
:param _id: The id of the DatabaseObject to remove from the cache.
"""
data = self.object_to_id.get(_id)
if data:
self.weakref_map.pop(weakref.ref(data))
self.object_to_id.pop(_id)
def __getitem__(self, item) -> Optional[DatabaseObject]:
"""
this returns the data obj
:param item: the id of the music object
:return:
"""
return self.object_to_id.get(item)
def get(self, _id: str) -> Optional[DatabaseObject]:
return self.__getitem__(_id)

View File

@ -1,50 +1,20 @@
from __future__ import annotations
import copy
from collections import defaultdict
from dataclasses import dataclass
from typing import (Any, Callable, Dict, Generator, Generic, Iterable,
Iterator, List, Optional, Set, Tuple, TypeVar, Union)
from ..utils import BColors, object_trace, output
from .parents import InnerData, OuterProxy
from typing import TypeVar, Generic, Dict, Optional, Iterable, List, Iterator, Tuple, Generator, Union, Any
from .parents import OuterProxy
from ..utils import object_trace
T = TypeVar('T', bound=OuterProxy)
@dataclass
class AppendHookArguments:
"""
This class is used to store the arguments for the append hook.
The best explanation is with an examples:
```
album = Album()
song = Song()
album.song_collection.append(song)
```
In this case, the append hook is triggered with the following arguments:
```
AppendHookArguments(
collection=album.song_collection,
new_object=song,
collection_root_objects=[album]
)
```
"""
collection: Collection
new_object: T
collection_root_objects: Set[InnerData]
class Collection(Generic[T]):
__is_collection__ = True
_data: List[T]
_indexed_from_id: Dict[int, Dict[str, Any]]
_indexed_values: Dict[str, Dict[Any, T]]
_indexed_values: Dict[str, set]
_indexed_to_objects: Dict[any, list]
shallow_list = property(fget=lambda self: self.data)
@ -54,7 +24,6 @@ class Collection(Generic[T]):
sync_on_append: Dict[str, Collection] = None,
append_object_to_attribute: Dict[str, T] = None,
extend_object_to_attribute: Dict[str, Collection] = None,
append_callbacks: Set[Callable[[AppendHookArguments], None]] = None,
) -> None:
self._collection_for: dict = dict()
@ -67,9 +36,8 @@ class Collection(Generic[T]):
self.append_object_to_attribute: Dict[str, T] = append_object_to_attribute or {}
self.extend_object_to_attribute: Dict[str, Collection[T]] = extend_object_to_attribute or {}
self.sync_on_append: Dict[str, Collection] = sync_on_append or {}
self.pull_from: List[Collection] = []
self.push_to: List[Collection] = []
self.append_callbacks: Set[Callable[[AppendHookArguments], None]] = append_callbacks or set()
self._id_to_index_values: Dict[int, set] = defaultdict(set)
# This is to cleanly unmap previously mapped items by their id
self._indexed_from_id: Dict[int, Dict[str, Any]] = defaultdict(dict)
@ -78,18 +46,10 @@ class Collection(Generic[T]):
self.extend(data)
def __hash__(self) -> int:
return id(self)
@property
def collection_names(self) -> List[str]:
return list(set(self._collection_for.values()))
def __repr__(self) -> str:
return f"Collection({' | '.join(self.collection_names)} {id(self)})"
return f"Collection({id(self)})"
def _map_element(self, __object: T, no_unmap: bool = False, **kwargs):
if not no_unmap:
def _map_element(self, __object: T, from_map: bool = False):
self._unmap_element(__object.id)
self._indexed_from_id[__object.id]["id"] = __object.id
@ -114,136 +74,73 @@ class Collection(Generic[T]):
del self._indexed_from_id[obj_id]
def _remap(self):
# reinitialize the mapping to clean it without time consuming operations
self._indexed_from_id: Dict[int, Dict[str, Any]] = defaultdict(dict)
self._indexed_values: Dict[str, Dict[Any, T]] = defaultdict(dict)
for e in self._data:
self._map_element(e, no_unmap=True)
def _find_object(self, __object: T, **kwargs) -> Optional[T]:
self._remap()
if __object.id in self._indexed_from_id:
return self._indexed_values["id"][__object.id]
def _find_object(self, __object: T) -> Optional[T]:
for name, value in __object.indexing_values:
if value in self._indexed_values[name]:
return self._indexed_values[name][value]
return None
def _append_new_object(self, other: T, **kwargs):
"""
This function appends the other object to the current collection.
This only works if not another object, which represents the same real life object exists in the collection.
"""
self._data.append(other)
other._inner._is_in_collection.add(self)
for attribute, a in self.sync_on_append.items():
# syncing two collections by reference
b = other.__getattribute__(attribute)
if a is b:
continue
object_trace(f"Syncing [{a}] = [{b}]")
b_data = b.data.copy()
b_collection_for = b._collection_for.copy()
del b
for synced_with, key in b_collection_for.items():
synced_with.__setattr__(key, a)
a._collection_for[synced_with] = key
a.extend(b_data, **kwargs)
# all of the existing hooks to get the defined datastructures
for collection_attribute, generator in self.extend_object_to_attribute.items():
other.__getattribute__(collection_attribute).extend(generator, **kwargs)
for attribute, new_object in self.append_object_to_attribute.items():
other.__getattribute__(attribute).append(new_object, **kwargs)
append_hook_args = AppendHookArguments(
collection=self,
new_object=other,
collection_root_objects=self._collection_for.keys(),
)
for callback in self.append_callbacks:
callback(append_hook_args)
def append(self, other: Optional[T], **kwargs):
def append(self, __object: Optional[T], already_is_parent: bool = False, from_map: bool = False):
"""
If an object, that represents the same entity exists in a relevant collection,
merge into this object. (and remap)
Else append to this collection.
:param other:
:param __object:
:param already_is_parent:
:param from_map:
:return:
"""
if other is None:
return
if not other._inner._has_data:
return
if other.id in self._indexed_from_id:
if __object is None:
return
object_trace(f"Appending {other.option_string} to {self}")
existing_object = self._find_object(__object)
if existing_object is None:
# append
self._data.append(__object)
self._map_element(__object)
for collection_attribute, child_collection in self.extend_object_to_attribute.items():
__object.__getattribute__(collection_attribute).extend(child_collection)
for attribute, new_object in self.append_object_to_attribute.items():
__object.__getattribute__(attribute).append(new_object)
# only modify collections if the object actually has been appended
for attribute, a in self.sync_on_append.items():
b = __object.__getattribute__(attribute)
object_trace(f"Syncing [{a}{id(a)}] = [{b}{id(b)}]")
data_to_extend = b.data
a._collection_for.update(b._collection_for)
for synced_with, key in b._collection_for.items():
synced_with.__setattr__(key, a)
a.extend(data_to_extend)
# switching collection in the case of push to
for c in self.push_to:
r = c._find_object(other)
if r is not None:
# output("found push to", r, other, c, self, color=BColors.RED, sep="\t")
return c.append(other, **kwargs)
for c in self.pull_from:
r = c._find_object(other)
if r is not None:
# output("found pull from", r, other, c, self, color=BColors.RED, sep="\t")
c.remove(r, existing=r, **kwargs)
existing = self._find_object(other)
if existing is None:
self._append_new_object(other, **kwargs)
else:
existing.merge(other, **kwargs)
def remove(self, *other_list: List[T], silent: bool = False, existing: Optional[T] = None, remove_from_other_collection=True, **kwargs):
other: T
for other in other_list:
existing: Optional[T] = existing or self._indexed_values["id"].get(other.id, None)
if existing is None:
if not silent:
raise ValueError(f"Object {other} not found in {self}")
return other
if remove_from_other_collection:
for c in copy.copy(other._inner._is_in_collection):
c.remove(other, silent=True, remove_from_other_collection=False, **kwargs)
other._inner._is_in_collection = set()
else:
self._data.remove(existing)
self._unmap_element(existing)
def contains(self, __object: T) -> bool:
return self._find_object(__object) is not None
def extend(self, other_collections: Optional[Generator[T, None, None]], **kwargs):
if other_collections is None:
# merge only if the two objects are not the same
if existing_object.id == __object.id:
return
for other_object in other_collections:
self.append(other_object, **kwargs)
old_id = existing_object.id
existing_object.merge(__object)
if existing_object.id != old_id:
self._unmap_element(old_id)
self._map_element(existing_object)
def extend(self, __iterable: Optional[Generator[T, None, None]]):
if __iterable is None:
return
for __object in __iterable:
self.append(__object)
@property
def data(self) -> List[T]:
@ -259,20 +156,8 @@ class Collection(Generic[T]):
def __iter__(self) -> Iterator[T]:
yield from self._data
def __merge__(self, other: Collection, **kwargs):
object_trace(f"merging {str(self)} | {str(other)}")
self.extend(other, **kwargs)
def __merge__(self, __other: Collection, override: bool = False):
self.extend(__other)
def __getitem__(self, item: int):
return self._data[item]
def get(self, item: int, default = None):
if item >= len(self._data):
return default
return self._data[item]
def __eq__(self, other: Collection) -> bool:
if self.empty and other.empty:
return True
return self._data == other._data

View File

@ -32,27 +32,14 @@ class FormattedText:
if self.is_empty and other.is_empty:
return True
return self.html == other.html
return self.doc == other.doc
@property
def markdown(self) -> str:
return md(self.html).strip()
@markdown.setter
def markdown(self, value: str) -> None:
self.html = mistune.markdown(value)
@property
def plain(self) -> str:
md = self.markdown
return md.replace("\n\n", "\n")
@plain.setter
def plain(self, value: str) -> None:
self.html = mistune.markdown(plain_to_markdown(value))
def __str__(self) -> str:
return self.markdown
plaintext = plain
plaintext = markdown

View File

@ -34,6 +34,6 @@ class Lyrics(OuterProxy):
@property
def metadata(self) -> Metadata:
return Metadata({
id3Mapping.UNSYNCED_LYRICS: [self.text.plaintext]
id3Mapping.UNSYNCED_LYRICS: [self.text.markdown]
})

View File

@ -92,7 +92,7 @@ class Mapping(Enum):
key = attribute.value
if key[0] == 'T':
# a text field
# a text fiel
return cls.get_text_instance(key, value)
if key[0] == "W":
# an url field
@ -355,12 +355,7 @@ class Metadata:
return None
list_data = self.id3_dict[field]
#correct duplications
correct_list_data = list()
for data in list_data:
if data not in correct_list_data:
correct_list_data.append(data)
list_data = correct_list_data
# convert for example the time objects to timestamps
for i, element in enumerate(list_data):
# for performances sake I don't do other checks if it is already the right type
@ -400,5 +395,6 @@ class Metadata:
"""
# set the tagging timestamp to the current time
self.__setitem__(Mapping.TAGGING_TIME, [ID3Timestamp.now()])
for field in self.id3_dict:
yield self.get_mutagen_object(field)

View File

@ -8,11 +8,10 @@ from typing import Optional, Dict, Tuple, List, Type, Generic, Any, TypeVar, Set
from pathlib import Path
import inspect
from .source import SourceCollection
from .metadata import Metadata
from ..utils import get_unix_time, object_trace, generate_id
from ..utils import get_unix_time, object_trace
from ..utils.config import logging_settings, main_settings
from ..utils.shared import HIGHEST_ID, DEBUG_PRINT_ID
from ..utils.shared import HIGHEST_ID
from ..utils.hacking import MetaClass
LOGGER = logging_settings["object_logger"]
@ -30,17 +29,9 @@ class InnerData:
"""
_refers_to_instances: set = None
_is_in_collection: set = None
_has_data: bool = False
"""
Attribute versions keep track, of if the attribute has been changed.
"""
def __init__(self, object_type, **kwargs):
self._refers_to_instances = set()
self._is_in_collection = set()
self._fetched_from: dict = {}
# initialize the default values
@ -51,39 +42,21 @@ class InnerData:
for key, value in kwargs.items():
if hasattr(value, "__is_collection__"):
value._collection_for[self] = key
self.__setattr__(key, value)
if self._has_data:
continue
def __setattr__(self, key: str, value):
if self._has_data or not hasattr(self, "_default_values"):
return super().__setattr__(key, value)
super().__setattr__("_has_data", not (key in self._default_values and self._default_values[key] == value))
return super().__setattr__(key, value)
def __hash__(self):
return self.id
def __merge__(self, __other: InnerData, **kwargs):
def __merge__(self, __other: InnerData, override: bool = False):
"""
:param __other:
:param override:
:return:
"""
self._fetched_from.update(__other._fetched_from)
self._is_in_collection.update(__other._is_in_collection)
for key, value in __other.__dict__.copy().items():
if key.startswith("_"):
continue
if hasattr(value, "__is_collection__") and key in self.__dict__:
self.__getattribute__(key).__merge__(value, **kwargs)
continue
# just set the other value if self doesn't already have it
if key not in self.__dict__ or (key in self.__dict__ and self.__dict__[key] == self._default_values.get(key)):
self.__setattr__(key, value)
@ -91,8 +64,13 @@ class InnerData:
# if the object of value implemented __merge__, it merges
existing = self.__getattribute__(key)
if hasattr(existing, "__merge__"):
existing.__merge__(value, **kwargs)
if hasattr(type(existing), "__merge__"):
existing.__merge__(value, override)
continue
# override the existing value if requested
if override:
self.__setattr__(key, value)
class OuterProxy:
@ -100,14 +78,14 @@ class OuterProxy:
Wraps the inner data, and provides apis, to naturally access those values.
"""
source_collection: SourceCollection
_default_factories: dict = {"source_collection": SourceCollection}
_default_factories: dict = {}
_outer_attribute: Set[str] = {"options", "metadata", "indexing_values", "option_string"}
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = tuple()
UPWARDS_COLLECTION_STRING_ATTRIBUTES = tuple()
TITEL = "id"
def __init__(self, _id: int = None, dynamic: bool = False, **kwargs):
_automatic_id: bool = False
@ -116,7 +94,7 @@ class OuterProxy:
generates a random integer id
the range is defined in the config
"""
_id = generate_id()
_id = random.randint(0, HIGHEST_ID)
_automatic_id = True
kwargs["automatic_id"] = _automatic_id
@ -138,7 +116,7 @@ class OuterProxy:
self._inner: InnerData = InnerData(type(self), **kwargs)
self._inner._refers_to_instances.add(self)
object_trace(f"creating {type(self).__name__} [{self.option_string}]")
object_trace(f"creating {type(self).__name__} [{self.title_string}]")
self.__init_collections__()
@ -195,18 +173,18 @@ class OuterProxy:
def __eq__(self, other: Any):
return self.__hash__() == other.__hash__()
def merge(self, __other: Optional[OuterProxy], **kwargs):
def merge(self, __other: Optional[OuterProxy], override: bool = False):
"""
1. merges the data of __other in self
2. replaces the data of __other with the data of self
:param __other:
:param override:
:return:
"""
if __other is None:
return
a_id = self.id
a = self
b = __other
@ -218,7 +196,7 @@ class OuterProxy:
if len(b._inner._refers_to_instances) > len(a._inner._refers_to_instances):
a, b = b, a
object_trace(f"merging {a.option_string} | {b.option_string}")
object_trace(f"merging {type(a).__name__} [{a.title_string} | {a.id}] with {type(b).__name__} [{b.title_string} | {b.id}]")
old_inner = b._inner
@ -226,13 +204,11 @@ class OuterProxy:
instance._inner = a._inner
a._inner._refers_to_instances.add(instance)
a._inner.__merge__(old_inner, **kwargs)
a._inner.__merge__(old_inner, override=override)
del old_inner
self.id = a_id
def __merge__(self, __other: Optional[OuterProxy], **kwargs):
self.merge(__other, **kwargs)
def __merge__(self, __other: Optional[OuterProxy], override: bool = False):
self.merge(__other, override)
def mark_as_fetched(self, *url_hash_list: List[str]):
for url_hash in url_hash_list:
@ -259,23 +235,7 @@ class OuterProxy:
@property
def options(self) -> List[P]:
r = []
for collection_string_attribute in self.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
r.extend(self.__getattribute__(collection_string_attribute))
r.append(self)
for collection_string_attribute in self.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
r.extend(self.__getattribute__(collection_string_attribute))
return r
@property
def option_string(self) -> str:
return self.title_string
INDEX_DEPENDS_ON: List[str] = []
return [self]
@property
def indexing_values(self) -> List[Tuple[str, object]]:
@ -307,49 +267,9 @@ class OuterProxy:
return r
@property
def root_collections(self) -> List[Collection]:
if len(self.UPWARDS_COLLECTION_STRING_ATTRIBUTES) == 0:
return [self]
def __repr__(self):
return f"{type(self).__name__}({', '.join(key + ': ' + str(val) for key, val in self.indexing_values)})"
r = []
for collection_string_attribute in self.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
r.extend(self.__getattribute__(collection_string_attribute))
return r
def _compile(self, **kwargs):
pass
def compile(self, from_root=False, **kwargs):
# compile from the root
if not from_root:
for c in self.root_collections:
c.compile(from_root=True, **kwargs)
return
self._compile(**kwargs)
for c_attribute in self.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
for c in self.__getattribute__(c_attribute):
c.compile(from_root=True, **kwargs)
TITEL = "id"
@property
def title_string(self) -> str:
return str(self.__getattribute__(self.TITEL)) + (f" {self.id}" if DEBUG_PRINT_ID else "")
@property
def title_value(self) -> str:
return str(self.__getattribute__(self.TITEL))
def __repr__(self):
return f"{type(self).__name__}({self.title_string})"
def get_child_collections(self):
for collection_string_attribute in self.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
yield self.__getattribute__(collection_string_attribute)
def get_parent_collections(self):
for collection_string_attribute in self.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
yield self.__getattribute__(collection_string_attribute)

View File

@ -1,82 +1,40 @@
from __future__ import annotations
import copy
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple, Type, Union
from typing import List, Optional, Dict, Tuple, Type, Union
import pycountry
from ..utils.config import main_settings
from ..utils.enums.album import AlbumStatus, AlbumType
from ..utils.enums.colors import BColors
from ..utils.shared import DEBUG_PRINT_ID
from ..utils.string_processing import unify
from .artwork import ArtworkCollection
from .collection import AppendHookArguments, Collection
from .contact import Contact
from .country import Country, Language
from ..utils.enums.album import AlbumType, AlbumStatus
from .collection import Collection
from .formatted_text import FormattedText
from .lyrics import Lyrics
from .metadata import ID3Timestamp
from .metadata import Mapping as id3Mapping
from .metadata import Metadata
from .contact import Contact
from .artwork import Artwork
from .metadata import (
Mapping as id3Mapping,
ID3Timestamp,
Metadata
)
from .option import Options
from .parents import OuterProxy
from .parents import OuterProxy as Base
from .parents import P
from .parents import OuterProxy, P
from .source import Source, SourceCollection
from .target import Target
from .country import Language, Country
from ..utils.string_processing import unify
from .parents import OuterProxy as Base
from ..utils.config import main_settings
"""
All Objects dependent
"""
CountryTyping = type(list(pycountry.countries)[0])
OPTION_STRING_DELIMITER = " | "
OPTION_BACKGROUND = BColors.GREY
OPTION_FOREGROUND = BColors.OKBLUE
def get_collection_string(
collection: Collection[Base],
template: str,
ignore_titles: Set[str] = None,
background: BColors = OPTION_BACKGROUND,
foreground: BColors = OPTION_FOREGROUND,
add_id: bool = DEBUG_PRINT_ID,
) -> str:
if collection.empty:
return ""
foreground = foreground.value
background = background.value
ignore_titles = ignore_titles or set()
r = background
def get_element_str(element) -> str:
nonlocal add_id
r = element.title_string.strip()
if add_id and False:
r += " " + str(element.id)
return r
element: Base
titel_list: List[str] = [get_element_str(element) for element in collection if element.title_string not in ignore_titles]
for i, titel in enumerate(titel_list):
delimiter = ", "
if i == len(collection) - 1:
delimiter = ""
elif i == len(collection) - 2:
delimiter = " and "
r += foreground + titel + BColors.ENDC.value + background + delimiter + BColors.ENDC.value
r += BColors.ENDC.value
return template.format(r)
class Song(Base):
title: str
@ -86,13 +44,13 @@ class Song(Base):
genre: str
note: FormattedText
tracksort: int
artwork: ArtworkCollection
artwork: Artwork
source_collection: SourceCollection
target_collection: Collection[Target]
lyrics_collection: Collection[Lyrics]
artist_collection: Collection[Artist]
main_artist_collection: Collection[Artist]
feature_artist_collection: Collection[Artist]
album_collection: Collection[Album]
@ -102,13 +60,13 @@ class Song(Base):
"source_collection": SourceCollection,
"target_collection": Collection,
"lyrics_collection": Collection,
"artwork": ArtworkCollection,
"artwork": Artwork,
"main_artist_collection": Collection,
"album_collection": Collection,
"artist_collection": Collection,
"feature_artist_collection": Collection,
"title": lambda: None,
"title": lambda: "",
"unified_title": lambda: None,
"isrc": lambda: None,
"genre": lambda: None,
@ -116,57 +74,31 @@ class Song(Base):
"tracksort": lambda: 0,
}
def __init__(
self,
title: str = None,
isrc: str = None,
length: int = None,
genre: str = None,
note: FormattedText = None,
source_list: List[Source] = None,
target_list: List[Target] = None,
lyrics_list: List[Lyrics] = None,
artist_list: List[Artist] = None,
feature_artist_list: List[Artist] = None,
album_list: List[Album] = None,
tracksort: int = 0,
artwork: Optional[ArtworkCollection] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
def __init__(self, title: str = "", unified_title: str = None, isrc: str = None, length: int = None,
genre: str = None, note: FormattedText = None, source_list: List[Source] = None,
target_list: List[Target] = None, lyrics_list: List[Lyrics] = None,
main_artist_list: List[Artist] = None, feature_artist_list: List[Artist] = None,
album_list: List[Album] = None, tracksort: int = 0, artwork: Optional[Artwork] = None, **kwargs) -> None:
Base.__init__(**real_kwargs)
Base.__init__(**locals())
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("artist_collection", "feature_artist_collection", "album_collection")
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("album_collection", "main_artist_collection", "feature_artist_collection")
TITEL = "title"
@staticmethod
def register_artwork_parent(append_hook_arguments: AppendHookArguments):
album: Album = append_hook_arguments.new_object
song: Song
for song in append_hook_arguments.collection_root_objects:
song.artwork.parent_artworks.add(album.artwork)
def __init_collections__(self) -> None:
self.feature_artist_collection.push_to = [self.artist_collection]
self.artist_collection.pull_from = [self.feature_artist_collection]
self.album_collection.sync_on_append = {
"artist_collection": self.artist_collection,
"artist_collection": self.main_artist_collection,
}
self.album_collection.append_object_to_attribute = {
"song_collection": self,
}
self.artist_collection.extend_object_to_attribute = {
"album_collection": self.album_collection
self.main_artist_collection.extend_object_to_attribute = {
"main_album_collection": self.album_collection
}
self.feature_artist_collection.extend_object_to_attribute = {
"album_collection": self.album_collection
self.feature_artist_collection.append_object_to_attribute = {
"feature_song_collection": self
}
self.album_collection.append_callbacks = set((Song.register_artwork_parent, ))
def _add_other_db_objects(self, object_type: Type[OuterProxy], object_list: List[OuterProxy]):
if object_type is Song:
@ -177,25 +109,20 @@ class Song(Base):
return
if isinstance(object_list, Artist):
self.feature_artist_collection.extend(object_list)
self.main_artist_collection.extend(object_list)
return
if isinstance(object_list, Album):
self.album_collection.extend(object_list)
return
def _compile(self):
self.artwork.compile()
INDEX_DEPENDS_ON = ("title", "isrc", "source_collection")
@property
def indexing_values(self) -> List[Tuple[str, object]]:
return [
('id', self.id),
('title', unify(self.title)),
('isrc', self.isrc),
*self.source_collection.indexing_values(),
*[('url', source.url) for source in self.source_collection]
]
@property
@ -207,35 +134,46 @@ class Song(Base):
id3Mapping.GENRE: [self.genre],
id3Mapping.TRACKNUMBER: [self.tracksort_str],
id3Mapping.COMMENT: [self.note.markdown],
id3Mapping.FILE_WEBPAGE_URL: self.source_collection.url_list,
id3Mapping.SOURCE_WEBPAGE_URL: self.source_collection.homepage_list,
})
# metadata.merge_many([s.get_song_metadata() for s in self.source_collection]) album sources have no relevant metadata for id3
metadata.merge_many([a.metadata for a in self.album_collection])
metadata.merge_many([a.metadata for a in self.artist_collection])
metadata.merge_many([a.metadata for a in self.main_artist_collection])
metadata.merge_many([a.metadata for a in self.feature_artist_collection])
metadata.merge_many([lyrics.metadata for lyrics in self.lyrics_collection])
return metadata
def get_artist_credits(self) -> str:
main_artists = ", ".join([artist.name for artist in self.artist_collection])
main_artists = ", ".join([artist.name for artist in self.main_artist_collection])
feature_artists = ", ".join([artist.name for artist in self.feature_artist_collection])
if len(feature_artists) == 0:
return main_artists
return f"{main_artists} feat. {feature_artists}"
def __repr__(self) -> str:
return f"Song(\"{self.title}\")"
@property
def option_string(self) -> str:
r = "song "
r += OPTION_FOREGROUND.value + self.title_string + BColors.ENDC.value + OPTION_BACKGROUND.value
r += get_collection_string(self.album_collection, " from {}", ignore_titles={self.title})
r += get_collection_string(self.artist_collection, " by {}")
r += get_collection_string(self.feature_artist_collection, " feat. {}" if len(self.artist_collection) > 0 else " by {}")
r = f"{self.__repr__()}"
if not self.album_collection.empty:
r += f" from Album({OPTION_STRING_DELIMITER.join(album.title for album in self.album_collection)})"
if not self.main_artist_collection.empty:
r += f" by Artist({OPTION_STRING_DELIMITER.join(artist.name for artist in self.main_artist_collection)})"
if not self.feature_artist_collection.empty:
r += f" feat. Artist({OPTION_STRING_DELIMITER.join(artist.name for artist in self.feature_artist_collection)})"
return r
@property
def options(self) -> List[P]:
options = self.main_artist_collection.shallow_list
options.extend(self.feature_artist_collection)
options.extend(self.album_collection)
options.append(self)
return options
@property
def tracksort_str(self) -> str:
"""
@ -248,6 +186,11 @@ class Song(Base):
return f"{self.tracksort}/{len(self.album_collection[0].song_collection) or 1}"
"""
All objects dependent on Album
"""
class Album(Base):
title: str
unified_title: str
@ -259,12 +202,10 @@ class Album(Base):
albumsort: int
notes: FormattedText
artwork: ArtworkCollection
source_collection: SourceCollection
song_collection: Collection[Song]
artist_collection: Collection[Artist]
feature_artist_collection: Collection[Artist]
song_collection: Collection[Song]
label_collection: Collection[Label]
_default_factories = {
@ -279,71 +220,43 @@ class Album(Base):
"date": ID3Timestamp,
"notes": FormattedText,
"artwork": lambda: ArtworkCollection(crop_images=False),
"source_collection": SourceCollection,
"song_collection": Collection,
"artist_collection": Collection,
"feature_artist_collection": Collection,
"song_collection": Collection,
"label_collection": Collection,
}
TITEL = "title"
# This is automatically generated
def __init__(
self,
title: str = None,
unified_title: str = None,
album_status: AlbumStatus = None,
album_type: AlbumType = None,
language: Language = None,
date: ID3Timestamp = None,
barcode: str = None,
albumsort: int = None,
notes: FormattedText = None,
artwork: ArtworkCollection = None,
source_list: List[Source] = None,
artist_list: List[Artist] = None,
song_list: List[Song] = None,
label_list: List[Label] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
Base.__init__(**real_kwargs)
def __init__(self, title: str = None, unified_title: str = None, album_status: AlbumStatus = None,
album_type: AlbumType = None, language: Language = None, date: ID3Timestamp = None,
barcode: str = None, albumsort: int = None, notes: FormattedText = None,
source_list: List[Source] = None, artist_list: List[Artist] = None, song_list: List[Song] = None,
label_list: List[Label] = None, **kwargs) -> None:
super().__init__(title=title, unified_title=unified_title, album_status=album_status, album_type=album_type,
language=language, date=date, barcode=barcode, albumsort=albumsort, notes=notes,
source_list=source_list, artist_list=artist_list, song_list=song_list, label_list=label_list,
**kwargs)
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = ("song_collection",)
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("label_collection", "artist_collection")
@staticmethod
def register_artwork_parent(append_hook_arguments: AppendHookArguments):
song: Song = append_hook_arguments.new_object
for root_object in append_hook_arguments.collection_root_objects:
song.artwork.parent_artworks.add(root_object.artwork)
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("artist_collection", "label_collection")
def __init_collections__(self):
self.feature_artist_collection.push_to = [self.artist_collection]
self.artist_collection.pull_from = [self.feature_artist_collection]
self.song_collection.append_object_to_attribute = {
"album_collection": self
}
self.song_collection.sync_on_append = {
"artist_collection": self.artist_collection
"main_artist_collection": self.artist_collection
}
self.artist_collection.append_object_to_attribute = {
"album_collection": self
"main_album_collection": self
}
self.artist_collection.extend_object_to_attribute = {
"label_collection": self.label_collection
}
self.song_collection.append_callbacks = set((Album.register_artwork_parent, ))
def _add_other_db_objects(self, object_type: Type[OuterProxy], object_list: List[OuterProxy]):
if object_type is Song:
self.song_collection.extend(object_list)
@ -360,14 +273,13 @@ class Album(Base):
self.label_collection.extend(object_list)
return
INDEX_DEPENDS_ON = ("title", "barcode", "source_collection")
@property
def indexing_values(self) -> List[Tuple[str, object]]:
return [
('id', self.id),
('title', unify(self.title)),
('barcode', self.barcode),
*self.source_collection.indexing_values(),
*[('url', source.url) for source in self.source_collection]
]
@property
@ -390,38 +302,20 @@ class Album(Base):
id3Mapping.ALBUMSORTORDER: [str(self.albumsort)] if self.albumsort is not None else []
})
def __repr__(self):
return f"Album(\"{self.title}\")"
@property
def option_string(self) -> str:
r = "album "
r += OPTION_FOREGROUND.value + self.title_string + BColors.ENDC.value + OPTION_BACKGROUND.value
r += get_collection_string(self.artist_collection, " by {}")
if len(self.artist_collection) <= 0:
r += get_collection_string(self.feature_artist_collection, " by {}")
r += get_collection_string(self.label_collection, " under {}")
return f"{self.__repr__()} " \
f"by Artist({OPTION_STRING_DELIMITER.join([artist.name + str(artist.id) for artist in self.artist_collection])}) " \
f"under Label({OPTION_STRING_DELIMITER.join([label.name for label in self.label_collection])})"
if len(self.song_collection) > 0:
r += f" with {len(self.song_collection)} songs"
return r
@property
def options(self) -> List[P]:
options = [*self.artist_collection, self, *self.song_collection]
def _compile(self):
self.analyze_implied_album_type()
self.update_tracksort()
self.fix_artist_collection()
def analyze_implied_album_type(self):
# if the song collection has only one song, it is reasonable to assume that it is a single
if len(self.song_collection) == 1:
self.album_type = AlbumType.SINGLE
return
# if the album already has an album type, we don't need to do anything
if self.album_type is not AlbumType.OTHER:
return
# for information on EP's I looked at https://www.reddit.com/r/WeAreTheMusicMakers/comments/a354ql/whats_the_cutoff_length_between_ep_and_album/
if len(self.song_collection) < 9:
self.album_type = AlbumType.EP
return
return options
def update_tracksort(self):
"""
@ -448,15 +342,17 @@ class Album(Base):
tracksort_map[i] = existing_list.pop(0)
tracksort_map[i].tracksort = i
def fix_artist_collection(self):
def compile(self, merge_into: bool = False):
"""
I add artists, that could only be feature artists to the feature artist collection.
They get automatically moved to main artist collection, if a matching artist exists in the main artist collection or is appended to it later on.
If I am not sure for any artist, I try to analyze the most common artist in the song collection of one album.
compiles the recursive structures,
and does depending on the object some other stuff.
no need to override if only the recursive structure should be built.
override self.build_recursive_structures() instead
"""
# move all artists that are in all feature_artist_collections, of every song, to the artist_collection
pass
self.update_tracksort()
self._build_recursive_structures(build_version=random.randint(0, 99999), merge=merge_into)
@property
def copyright(self) -> str:
@ -489,38 +385,43 @@ class Album(Base):
return self.album_type.value
"""
All objects dependent on Artist
"""
class Artist(Base):
name: str
unified_name: str
country: Country
formed_in: ID3Timestamp
notes: FormattedText
lyrical_themes: List[str]
general_genre: str
unformatted_location: str
artwork: ArtworkCollection
unformated_location: str
source_collection: SourceCollection
contact_collection: Collection[Contact]
album_collection: Collection[Album]
feature_song_collection: Collection[Song]
main_album_collection: Collection[Album]
label_collection: Collection[Label]
_default_factories = {
"name": lambda: None,
"name": str,
"unified_name": lambda: None,
"country": lambda: None,
"unformatted_location": lambda: None,
"unformated_location": lambda: None,
"formed_in": ID3Timestamp,
"notes": FormattedText,
"lyrical_themes": list,
"general_genre": lambda: "",
"artwork": ArtworkCollection,
"source_collection": SourceCollection,
"album_collection": Collection,
"feature_song_collection": Collection,
"main_album_collection": Collection,
"contact_collection": Collection,
"label_collection": Collection,
}
@ -528,38 +429,30 @@ class Artist(Base):
TITEL = "name"
# This is automatically generated
def __init__(
self,
name: str = None,
unified_name: str = None,
country: Country = None,
formed_in: ID3Timestamp = None,
notes: FormattedText = None,
lyrical_themes: List[str] = None,
general_genre: str = None,
artwork: ArtworkCollection = None,
unformatted_location: str = None,
source_list: List[Source] = None,
contact_list: List[Contact] = None,
feature_song_list: List[Song] = None,
album_list: List[Album] = None,
label_list: List[Label] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
def __init__(self, name: str = "", unified_name: str = None, country: Country = None,
formed_in: ID3Timestamp = None, notes: FormattedText = None, lyrical_themes: List[str] = None,
general_genre: str = None, unformated_location: str = None, source_list: List[Source] = None,
contact_list: List[Contact] = None, feature_song_list: List[Song] = None,
main_album_list: List[Album] = None, label_list: List[Label] = None, **kwargs) -> None:
Base.__init__(**real_kwargs)
super().__init__(name=name, unified_name=unified_name, country=country, formed_in=formed_in, notes=notes,
lyrical_themes=lyrical_themes, general_genre=general_genre,
unformated_location=unformated_location, source_list=source_list, contact_list=contact_list,
feature_song_list=feature_song_list, main_album_list=main_album_list, label_list=label_list,
**kwargs)
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = ("album_collection",)
DOWNWARDS_COLLECTION_STRING_ATTRIBUTES = ("feature_song_collection", "main_album_collection")
UPWARDS_COLLECTION_STRING_ATTRIBUTES = ("label_collection",)
def __init_collections__(self):
self.album_collection.append_object_to_attribute = {
self.feature_song_collection.append_object_to_attribute = {
"feature_artist_collection": self
}
self.main_album_collection.append_object_to_attribute = {
"artist_collection": self
}
self.label_collection.append_object_to_attribute = {
"current_artist_collection": self
}
@ -567,32 +460,39 @@ class Artist(Base):
def _add_other_db_objects(self, object_type: Type[OuterProxy], object_list: List[OuterProxy]):
if object_type is Song:
# this doesn't really make sense
# self.feature_song_collection.extend(object_list)
return
if object_type is Artist:
return
if object_type is Album:
self.album_collection.extend(object_list)
self.main_album_collection.extend(object_list)
return
if object_type is Label:
self.label_collection.extend(object_list)
return
def _compile(self):
self.update_albumsort()
@property
def options(self) -> List[P]:
options = [self, *self.main_album_collection.shallow_list, *self.feature_album]
print(options)
return options
def update_albumsort(self):
"""
This updates the albumsort attributes, of the albums in
`self.album_collection`, and sorts the albums, if possible.
`self.main_album_collection`, and sorts the albums, if possible.
It is advised to only call this function, once all the albums are
added to the artist.
:return:
"""
if len(self.main_album_collection) <= 0:
return
type_section: Dict[AlbumType, int] = defaultdict(lambda: 2, {
AlbumType.OTHER: 0, # if I don't know it, I add it to the first section
AlbumType.STUDIO_ALBUM: 0,
@ -604,7 +504,7 @@ class Artist(Base):
# order albums in the previously defined section
album: Album
for album in self.album_collection:
for album in self.main_album_collection:
sections[type_section[album.album_type]].append(album)
def sort_section(_section: List[Album], last_albumsort: int) -> int:
@ -635,39 +535,85 @@ class Artist(Base):
album_list.extend(sections[section_index])
# replace the old collection with the new one
self.album_collection._data = album_list
self.main_album_collection: Collection = Collection(data=album_list, element_type=Album)
INDEX_DEPENDS_ON = ("name", "source_collection", "contact_collection")
@property
def indexing_values(self) -> List[Tuple[str, object]]:
return [
('id', self.id),
('name', unify(self.name)),
*[('contact', contact.value) for contact in self.contact_collection],
*self.source_collection.indexing_values(),
*[('url', source.url) for source in self.source_collection],
*[('contact', contact.value) for contact in self.contact_collection]
]
@property
def metadata(self) -> Metadata:
metadata = Metadata({
id3Mapping.ARTIST: [self.name],
id3Mapping.ARTIST_WEBPAGE_URL: self.source_collection.url_list,
id3Mapping.ARTIST: [self.name]
})
metadata.merge_many([s.get_artist_metadata() for s in self.source_collection])
return metadata
"""
def __str__(self, include_notes: bool = False):
string = self.name or ""
if include_notes:
plaintext_notes = self.notes.get_plaintext()
if plaintext_notes is not None:
string += "\n" + plaintext_notes
return string
"""
def __repr__(self):
return f"Artist(\"{self.name}\")"
@property
def option_string(self) -> str:
r = "artist "
r += OPTION_FOREGROUND.value + self.title_string + BColors.ENDC.value + OPTION_BACKGROUND.value
r += get_collection_string(self.label_collection, " under {}")
return f"{self.__repr__()} " \
f"under Label({OPTION_STRING_DELIMITER.join([label.name for label in self.label_collection])})"
r += OPTION_BACKGROUND.value
if len(self.album_collection) > 0:
r += f" with {len(self.album_collection)} albums"
@property
def options(self) -> List[P]:
options = [self]
options.extend(self.main_album_collection)
options.extend(self.feature_song_collection)
return options
r += BColors.ENDC.value
@property
def feature_album(self) -> Album:
return Album(
title="features",
album_status=AlbumStatus.UNRELEASED,
album_type=AlbumType.COMPILATION_ALBUM,
is_split=True,
albumsort=666,
dynamic=True,
song_list=self.feature_song_collection.shallow_list
)
return r
def get_all_songs(self) -> List[Song]:
"""
returns a list of all Songs.
probably not that useful, because it is unsorted
"""
collection = self.feature_song_collection.copy()
for album in self.discography:
collection.extend(album.song_collection)
return collection
@property
def discography(self) -> List[Album]:
flat_copy_discography = self.main_album_collection.copy()
flat_copy_discography.append(self.feature_album)
return flat_copy_discography
"""
Label
"""
class Label(Base):
@ -697,21 +643,12 @@ class Label(Base):
TITEL = "name"
def __init__(
self,
name: str = None,
unified_name: str = None,
notes: FormattedText = None,
source_list: List[Source] = None,
contact_list: List[Contact] = None,
album_list: List[Album] = None,
current_artist_list: List[Artist] = None,
**kwargs
) -> None:
real_kwargs = copy.copy(locals())
real_kwargs.update(real_kwargs.pop("kwargs", {}))
Base.__init__(**real_kwargs)
def __init__(self, name: str = None, unified_name: str = None, notes: FormattedText = None,
source_list: List[Source] = None, contact_list: List[Contact] = None,
album_list: List[Album] = None, current_artist_list: List[Artist] = None, **kwargs) -> None:
super().__init__(name=name, unified_name=unified_name, notes=notes, source_list=source_list,
contact_list=contact_list, album_list=album_list, current_artist_list=current_artist_list,
**kwargs)
def __init_collections__(self):
self.album_collection.append_object_to_attribute = {
@ -725,6 +662,7 @@ class Label(Base):
@property
def indexing_values(self) -> List[Tuple[str, object]]:
return [
('id', self.id),
('name', unify(self.name)),
*[('url', source.url) for source in self.source_collection]
]
@ -751,4 +689,4 @@ class Label(Base):
@property
def option_string(self):
return "label " + OPTION_FOREGROUND.value + self.name + BColors.ENDC.value
return self.__repr__()

View File

@ -2,237 +2,142 @@ from __future__ import annotations
from collections import defaultdict
from enum import Enum
from typing import (
List,
Dict,
Set,
Tuple,
Optional,
Iterable,
Generator,
TypedDict,
Callable,
Any,
TYPE_CHECKING
)
from urllib.parse import urlparse, ParseResult
from dataclasses import dataclass, field
from functools import cached_property
from typing import List, Dict, Set, Tuple, Optional, Iterable
from urllib.parse import urlparse
from ..utils import generate_id
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.enums.source import SourcePages, SourceTypes
from ..utils.config import youtube_settings
from ..utils.string_processing import hash_url, shorten_display_url
from ..utils.string_processing import hash_url
from .metadata import Mapping, Metadata
if TYPE_CHECKING:
from ..pages.abstract import Page
from .parents import OuterProxy
from .collection import Collection
@dataclass
class Source:
source_type: SourceType
class Source(OuterProxy):
url: str
referrer_page: SourceType = None
audio_url: Optional[str] = None
additional_data: dict = field(default_factory=dict)
page_enum: SourcePages
referer_page: SourcePages
def __post_init__(self):
self.referrer_page = self.referrer_page or self.source_type
audio_url: str
_default_factories = {
"audio_url": lambda: None,
}
# This is automatically generated
def __init__(self, page_enum: SourcePages, url: str, referer_page: SourcePages = None, audio_url: str = None,
**kwargs) -> None:
if referer_page is None:
referer_page = page_enum
super().__init__(url=url, page_enum=page_enum, referer_page=referer_page, audio_url=audio_url, **kwargs)
@classmethod
def match_url(cls, url: str, referrer_page: SourceType) -> Optional[Source]:
def match_url(cls, url: str, referer_page: SourcePages) -> Optional["Source"]:
"""
this shouldn't be used, unless you are not certain what the source is for
this shouldn't be used, unlesse you are not certain what the source is for
the reason is that it is more inefficient
"""
parsed_url = urlparse(url)
url = parsed_url.geturl()
parsed = urlparse(url)
url = parsed.geturl()
if "musify" in parsed_url.netloc:
return cls(ALL_SOURCE_TYPES.MUSIFY, url, referrer_page=referrer_page)
if "musify" in parsed.netloc:
return cls(SourcePages.MUSIFY, url, referer_page=referer_page)
if parsed_url.netloc in [_url.netloc for _url in youtube_settings['youtube_url']]:
return cls(ALL_SOURCE_TYPES.YOUTUBE, url, referrer_page=referrer_page)
if parsed.netloc in [_url.netloc for _url in youtube_settings['youtube_url']]:
return cls(SourcePages.YOUTUBE, url, referer_page=referer_page)
if url.startswith("https://www.deezer"):
return cls(ALL_SOURCE_TYPES.DEEZER, url, referrer_page=referrer_page)
return cls(SourcePages.DEEZER, url, referer_page=referer_page)
if url.startswith("https://open.spotify.com"):
return cls(ALL_SOURCE_TYPES.SPOTIFY, url, referrer_page=referrer_page)
return cls(SourcePages.SPOTIFY, url, referer_page=referer_page)
if "bandcamp" in url:
return cls(ALL_SOURCE_TYPES.BANDCAMP, url, referrer_page=referrer_page)
return cls(SourcePages.BANDCAMP, url, referer_page=referer_page)
if "wikipedia" in parsed_url.netloc:
return cls(ALL_SOURCE_TYPES.WIKIPEDIA, url, referrer_page=referrer_page)
if "wikipedia" in parsed.netloc:
return cls(SourcePages.WIKIPEDIA, url, referer_page=referer_page)
if url.startswith("https://www.metal-archives.com/"):
return cls(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, url, referrer_page=referrer_page)
return cls(SourcePages.ENCYCLOPAEDIA_METALLUM, url, referer_page=referer_page)
# the less important once
if url.startswith("https://www.facebook"):
return cls(ALL_SOURCE_TYPES.FACEBOOK, url, referrer_page=referrer_page)
return cls(SourcePages.FACEBOOK, url, referer_page=referer_page)
if url.startswith("https://www.instagram"):
return cls(ALL_SOURCE_TYPES.INSTAGRAM, url, referrer_page=referrer_page)
return cls(SourcePages.INSTAGRAM, url, referer_page=referer_page)
if url.startswith("https://twitter"):
return cls(ALL_SOURCE_TYPES.TWITTER, url, referrer_page=referrer_page)
return cls(SourcePages.TWITTER, url, referer_page=referer_page)
if url.startswith("https://myspace.com"):
return cls(ALL_SOURCE_TYPES.MYSPACE, url, referrer_page=referrer_page)
return cls(SourcePages.MYSPACE, url, referer_page=referer_page)
@property
def has_page(self) -> bool:
return self.source_type.page is not None
def get_song_metadata(self) -> Metadata:
return Metadata({
Mapping.FILE_WEBPAGE_URL: [self.url],
Mapping.SOURCE_WEBPAGE_URL: [self.homepage]
})
@property
def page(self) -> Page:
return self.source_type.page
@property
def parsed_url(self) -> ParseResult:
return urlparse(self.url)
def get_artist_metadata(self) -> Metadata:
return Metadata({
Mapping.ARTIST_WEBPAGE_URL: [self.url]
})
@property
def hash_url(self) -> str:
return hash_url(self.url)
@property
def indexing_values(self) -> list:
r = [hash_url(self.url)]
if self.audio_url:
r.append(hash_url(self.audio_url))
return r
def metadata(self) -> Metadata:
return self.get_song_metadata()
@property
def indexing_values(self) -> List[Tuple[str, object]]:
return [
('id', self.id),
('url', self.url),
('audio_url', self.audio_url),
]
def __str__(self):
return self.__repr__()
def __repr__(self) -> str:
return f"Src({self.source_type.value}: {shorten_display_url(self.url)})"
return f"Src({self.page_enum.value}: {self.url}, {self.audio_url})"
def __merge__(self, other: Source, **kwargs):
if self.audio_url is None:
self.audio_url = other.audio_url
self.additional_data.update(other.additional_data)
@property
def title_string(self) -> str:
return self.url
page_str = property(fget=lambda self: self.source_type.value)
page_str = property(fget=lambda self: self.page_enum.value)
type_str = property(fget=lambda self: self.type_enum.value)
homepage = property(fget=lambda self: SourcePages.get_homepage(self.page_enum))
class SourceTypeSorting(TypedDict):
sort_key: Callable[[SourceType], Any]
reverse: bool
only_with_page: bool
class SourceCollection:
__change_version__ = generate_id()
_indexed_sources: Dict[str, Source]
_sources_by_type: Dict[SourceType, List[Source]]
class SourceCollection(Collection):
def __init__(self, data: Optional[Iterable[Source]] = None, **kwargs):
self._sources_by_type = defaultdict(list)
self._indexed_sources = {}
self._page_to_source_list: Dict[SourcePages, List[Source]] = defaultdict(list)
self.extend(data or [])
super().__init__(data=data, **kwargs)
def source_types(
self,
only_with_page: bool = False,
sort_key = lambda page: page.name,
reverse: bool = False
) -> Iterable[SourceType]:
"""
Returns a list of all source types contained in this source collection.
def _map_element(self, __object: Source, **kwargs):
super()._map_element(__object, **kwargs)
Args:
only_with_page (bool, optional): If True, only returns source types that have a page, meaning you can download from them.
sort_key (function, optional): A function that defines the sorting key for the source types. Defaults to lambda page: page.name.
reverse (bool, optional): If True, sorts the source types in reverse order. Defaults to False.
Returns:
Iterable[SourceType]: A list of source types.
"""
source_types: List[SourceType] = self._sources_by_type.keys()
if only_with_page:
source_types = filter(lambda st: st.has_page, source_types)
return sorted(
source_types,
key=sort_key,
reverse=reverse
)
def get_sources(self, *source_types: List[SourceType], source_type_sorting: SourceTypeSorting = None) -> Generator[Source]:
"""
Retrieves sources based on the provided source types and source type sorting.
Args:
*source_types (List[Source]): Variable number of source types to filter the sources.
source_type_sorting (SourceTypeSorting): Sorting criteria for the source types. This is only relevant if no source types are provided.
Yields:
Generator[Source]: A generator that yields the sources based on the provided filters.
Returns:
None
"""
if not len(source_types):
source_type_sorting = source_type_sorting or {}
source_types = self.source_types(**source_type_sorting)
for source_type in source_types:
yield from self._sources_by_type[source_type]
def append(self, source: Source):
if source is None:
return
existing_source = None
for key in source.indexing_values:
if key in self._indexed_sources:
existing_source = self._indexed_sources[key]
break
if existing_source is not None:
existing_source.__merge__(source)
source = existing_source
else:
self._sources_by_type[source.source_type].append(source)
changed = False
for key in source.indexing_values:
if key not in self._indexed_sources:
changed = True
self._indexed_sources[key] = source
if changed:
self.__change_version__ = generate_id()
def extend(self, sources: Iterable[Source]):
for source in sources:
self.append(source)
def __iter__(self):
yield from self.get_sources()
def __merge__(self, other: SourceCollection, **kwargs):
self.extend(other)
self._page_to_source_list[__object.page_enum].append(__object)
@property
def hash_url_list(self) -> List[str]:
return [hash_url(source.url) for source in self.get_sources()]
def source_pages(self) -> Set[SourcePages]:
return set(source.page_enum for source in self._data)
@property
def url_list(self) -> List[str]:
return [source.url for source in self.get_sources()]
@property
def homepage_list(self) -> List[str]:
return [source_type.homepage for source_type in self._sources_by_type.keys()]
def indexing_values(self) -> Generator[Tuple[str, str], None, None]:
for index in self._indexed_sources:
yield "url", index
def get_sources_from_page(self, source_page: SourcePages) -> List[Source]:
"""
getting the sources for a specific page like
YouTube or musify
"""
return self._page_to_source_list[source_page].copy()

View File

@ -1,17 +1,17 @@
from __future__ import annotations
from pathlib import Path
from typing import List, Tuple, TextIO, Union
import logging
import random
from pathlib import Path
from typing import List, Optional, TextIO, Tuple, Union
import requests
from tqdm import tqdm
from ..utils.config import logging_settings, main_settings
from ..utils.shared import HIGHEST_ID
from ..utils.string_processing import fit_to_file_system
from .parents import OuterProxy
from ..utils.shared import HIGHEST_ID
from ..utils.config import main_settings, logging_settings
from ..utils.string_processing import fit_to_file_system
LOGGER = logging.getLogger("target")
@ -31,11 +31,7 @@ class Target(OuterProxy):
}
@classmethod
def temp(cls, name: str = None, file_extension: Optional[str] = None) -> P:
name = name or str(random.randint(0, HIGHEST_ID))
if file_extension is not None:
name = f"{name}.{file_extension}"
def temp(cls, name: str = str(random.randint(0, HIGHEST_ID))) -> P:
return cls(main_settings["temp_directory"] / name)
# This is automatically generated
@ -118,11 +114,3 @@ class Target(OuterProxy):
def read_bytes(self) -> bytes:
return self.file_path.read_bytes()
@property
def raw_content(self) -> bytes:
return self.file_path.read_bytes()
@raw_content.setter
def raw_content(self, content: bytes):
self.file_path.write_bytes(content)

View File

@ -3,6 +3,5 @@ from .musify import Musify
from .youtube import YouTube
from .youtube_music import YoutubeMusic
from .bandcamp import Bandcamp
from .genius import Genius
from .abstract import Page, INDEPENDENT_DB_OBJECTS

View File

@ -3,9 +3,8 @@ import random
import re
from copy import copy
from pathlib import Path
from typing import Optional, Union, Type, Dict, Set, List, Tuple, TypedDict
from typing import Optional, Union, Type, Dict, Set, List, Tuple
from string import Formatter
from dataclasses import dataclass, field
import requests
from bs4 import BeautifulSoup
@ -22,45 +21,131 @@ from ..objects import (
Collection,
Label,
)
from ..utils.enums import SourceType
from ..utils.enums.source import SourcePages
from ..utils.enums.album import AlbumType
from ..audio import write_metadata_to_target, correct_codec
from ..utils.config import main_settings
from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
from ..utils.string_processing import fit_to_file_system
from ..utils import trace, output, BColors
from ..utils import trace
INDEPENDENT_DB_OBJECTS = Union[Label, Album, Artist, Song]
INDEPENDENT_DB_TYPES = Union[Type[Song], Type[Album], Type[Artist], Type[Label]]
@dataclass
class FetchOptions:
download_all: bool = False
album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
@dataclass
class DownloadOptions:
download_all: bool = False
album_type_blacklist: Set[AlbumType] = field(default_factory=lambda: set(AlbumType(a) for a in main_settings["album_type_blacklist"]))
class NamingDict(dict):
CUSTOM_KEYS: Dict[str, str] = {
"label": "label.name",
"artist": "artist.name",
"song": "song.title",
"isrc": "song.isrc",
"album": "album.title",
"album_type": "album.album_type_string"
}
def __init__(self, values: dict, object_mappings: Dict[str, DatabaseObject] = None):
self.object_mappings: Dict[str, DatabaseObject] = object_mappings or dict()
super().__init__(values)
self["audio_format"] = main_settings["audio_format"]
def add_object(self, music_object: DatabaseObject):
self.object_mappings[type(music_object).__name__.lower()] = music_object
def copy(self) -> dict:
return type(self)(super().copy(), self.object_mappings.copy())
def __getitem__(self, key: str) -> str:
return fit_to_file_system(super().__getitem__(key))
def default_value_for_name(self, name: str) -> str:
return f'Various {name.replace("_", " ").title()}'
def __missing__(self, key: str) -> str:
if "." not in key:
if key not in self.CUSTOM_KEYS:
return self.default_value_for_name(key)
key = self.CUSTOM_KEYS[key]
frag_list = key.split(".")
object_name = frag_list[0].strip().lower()
attribute_name = frag_list[-1].strip().lower()
if object_name not in self.object_mappings:
return self.default_value_for_name(attribute_name)
music_object = self.object_mappings[object_name]
try:
value = getattr(music_object, attribute_name)
if value is None:
return self.default_value_for_name(attribute_name)
return str(value)
except AttributeError:
return self.default_value_for_name(attribute_name)
def _clean_music_object(music_object: INDEPENDENT_DB_OBJECTS, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
if type(music_object) == Label:
return _clean_label(label=music_object, collections=collections)
if type(music_object) == Artist:
return _clean_artist(artist=music_object, collections=collections)
if type(music_object) == Album:
return _clean_album(album=music_object, collections=collections)
if type(music_object) == Song:
return _clean_song(song=music_object, collections=collections)
def _clean_collection(collection: Collection, collection_dict: Dict[INDEPENDENT_DB_TYPES, Collection]):
if collection.element_type not in collection_dict:
return
for i, element in enumerate(collection):
r = collection_dict[collection.element_type].append(element, merge_into_existing=True)
collection[i] = r.current_element
if not r.was_the_same:
_clean_music_object(r.current_element, collection_dict)
def _clean_label(label: Label, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(label.current_artist_collection, collections)
_clean_collection(label.album_collection, collections)
def _clean_artist(artist: Artist, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(artist.main_album_collection, collections)
_clean_collection(artist.feature_song_collection, collections)
_clean_collection(artist.label_collection, collections)
def _clean_album(album: Album, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(album.label_collection, collections)
_clean_collection(album.song_collection, collections)
_clean_collection(album.artist_collection, collections)
def _clean_song(song: Song, collections: Dict[INDEPENDENT_DB_TYPES, Collection]):
_clean_collection(song.album_collection, collections)
_clean_collection(song.feature_artist_collection, collections)
_clean_collection(song.main_artist_collection, collections)
process_audio_if_found: bool = False
process_metadata_if_found: bool = True
class Page:
SOURCE_TYPE: SourceType
LOGGER: logging.Logger
"""
This is an abstract class, laying out the
functionality for every other class fetching something
"""
def __new__(cls, *args, **kwargs):
cls.LOGGER = logging.getLogger(cls.__name__)
SOURCE_TYPE: SourcePages
LOGGER = logging.getLogger("this shouldn't be used")
return super().__new__(cls)
def __init__(self, download_options: DownloadOptions = None, fetch_options: FetchOptions = None):
self.SOURCE_TYPE.register_page(self)
self.download_options: DownloadOptions = download_options or DownloadOptions()
self.fetch_options: FetchOptions = fetch_options or FetchOptions()
# set this to true, if all song details can also be fetched by fetching album details
NO_ADDITIONAL_DATA_FROM_SONG = False
def _search_regex(self, pattern, string, default=None, fatal=True, flags=0, group=None):
"""
@ -133,7 +218,106 @@ class Page:
def song_search(self, song: Song) -> List[Song]:
return []
# to fetch stuff
def fetch_details(
self,
music_object: DatabaseObject,
stop_at_level: int = 1,
post_process: bool = True
) -> DatabaseObject:
"""
when a music object with lacking data is passed in, it returns
the SAME object **(no copy)** with more detailed data.
If you for example put in, an album, it fetches the tracklist
:param music_object:
:param stop_at_level:
This says the depth of the level the scraper will recurse to.
If this is for example set to 2, then the levels could be:
1. Level: the album
2. Level: every song of the album + every artist of the album
If no additional requests are needed to get the data one level below the supposed stop level
this gets ignored
:return detailed_music_object: IT MODIFIES THE INPUT OBJ
"""
# creating a new object, of the same type
new_music_object: Optional[DatabaseObject] = None
fetched_from_url: List[str] = []
# only certain database objects, have a source list
if isinstance(music_object, INDEPENDENT_DB_OBJECTS):
source: Source
for source in music_object.source_collection.get_sources_from_page(self.SOURCE_TYPE):
if music_object.already_fetched_from(source.hash_url):
continue
tmp = self.fetch_object_from_source(
source=source,
enforce_type=type(music_object),
stop_at_level=stop_at_level,
post_process=False,
type_string=type(music_object).__name__,
title_string=music_object.title_string,
)
if new_music_object is None:
new_music_object = tmp
else:
new_music_object.merge(tmp)
fetched_from_url.append(source.hash_url)
if new_music_object is not None:
music_object.merge(new_music_object)
music_object.mark_as_fetched(*fetched_from_url)
return music_object
def fetch_object_from_source(
self,
source: Source,
stop_at_level: int = 2,
enforce_type: Type[DatabaseObject] = None,
post_process: bool = True,
type_string: str = "",
title_string: str = "",
) -> Optional[DatabaseObject]:
obj_type = self.get_source_type(source)
if obj_type is None:
return None
if enforce_type != obj_type and enforce_type is not None:
self.LOGGER.warning(f"Object type isn't type to enforce: {enforce_type}, {obj_type}")
return None
music_object: DatabaseObject = None
fetch_map = {
Song: self.fetch_song,
Album: self.fetch_album,
Artist: self.fetch_artist,
Label: self.fetch_label
}
if obj_type in fetch_map:
music_object = fetch_map[obj_type](source, stop_at_level)
else:
self.LOGGER.warning(f"Can't fetch details of type: {obj_type}")
return None
if stop_at_level > 0:
trace(f"fetching {type_string} [{title_string}] [stop_at_level={stop_at_level}]")
collection: Collection
for collection_str in music_object.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
collection = music_object.__getattribute__(collection_str)
for sub_element in collection:
sub_element.merge(
self.fetch_details(sub_element, stop_at_level=stop_at_level - 1, post_process=False))
return music_object
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
return Song()
@ -146,7 +330,155 @@ class Page:
def fetch_label(self, source: Source, stop_at_level: int = 1) -> Label:
return Label()
# to download stuff
def download(
self,
music_object: DatabaseObject,
genre: str,
download_all: bool = False,
process_metadata_anyway: bool = True
) -> DownloadResult:
naming_dict: NamingDict = NamingDict({"genre": genre})
def fill_naming_objects(naming_music_object: DatabaseObject):
nonlocal naming_dict
for collection_name in naming_music_object.UPWARDS_COLLECTION_STRING_ATTRIBUTES:
collection: Collection = getattr(naming_music_object, collection_name)
if collection.empty:
continue
dom_ordered_music_object: DatabaseObject = collection[0]
naming_dict.add_object(dom_ordered_music_object)
return fill_naming_objects(dom_ordered_music_object)
fill_naming_objects(music_object)
return self._download(music_object, naming_dict, download_all, process_metadata_anyway=process_metadata_anyway)
def _download(
self,
music_object: DatabaseObject,
naming_dict: NamingDict,
download_all: bool = False,
skip_details: bool = False,
process_metadata_anyway: bool = True
) -> DownloadResult:
trace(f"downloading {type(music_object).__name__} [{music_object.title_string}]")
skip_next_details = skip_details
# Skips all releases, that are defined in shared.ALBUM_TYPE_BLACKLIST, if download_all is False
if isinstance(music_object, Album):
if self.NO_ADDITIONAL_DATA_FROM_SONG:
skip_next_details = True
if not download_all and music_object.album_type.value in main_settings["album_type_blacklist"]:
return DownloadResult()
if not (isinstance(music_object, Song) and self.NO_ADDITIONAL_DATA_FROM_SONG):
self.fetch_details(music_object=music_object, stop_at_level=1)
if isinstance(music_object, Album):
music_object.update_tracksort()
naming_dict.add_object(music_object)
if isinstance(music_object, Song):
return self._download_song(music_object, naming_dict, process_metadata_anyway=process_metadata_anyway)
download_result: DownloadResult = DownloadResult()
for collection_name in music_object.DOWNWARDS_COLLECTION_STRING_ATTRIBUTES:
collection: Collection = getattr(music_object, collection_name)
sub_ordered_music_object: DatabaseObject
for sub_ordered_music_object in collection:
download_result.merge(self._download(sub_ordered_music_object, naming_dict.copy(), download_all,
skip_details=skip_next_details,
process_metadata_anyway=process_metadata_anyway))
return download_result
def _download_song(self, song: Song, naming_dict: NamingDict, process_metadata_anyway: bool = True):
if "genre" not in naming_dict and song.genre is not None:
naming_dict["genre"] = song.genre
if song.genre is None:
song.genre = naming_dict["genre"]
path_parts = Formatter().parse(main_settings["download_path"])
file_parts = Formatter().parse(main_settings["download_file"])
new_target = Target(
relative_to_music_dir=True,
file_path=Path(
main_settings["download_path"].format(**{part[1]: naming_dict[part[1]] for part in path_parts}),
main_settings["download_file"].format(**{part[1]: naming_dict[part[1]] for part in file_parts})
)
)
if song.target_collection.empty:
song.target_collection.append(new_target)
sources = song.source_collection.get_sources_from_page(self.SOURCE_TYPE)
if len(sources) == 0:
return DownloadResult(error_message=f"No source found for {song.title} as {self.__class__.__name__}.")
temp_target: Target = Target(
relative_to_music_dir=False,
file_path=Path(
main_settings["temp_directory"],
str(song.id)
)
)
r = DownloadResult(1)
found_on_disc = False
target: Target
for target in song.target_collection:
if target.exists:
if process_metadata_anyway:
target.copy_content(temp_target)
found_on_disc = True
r.found_on_disk += 1
r.add_target(target)
if found_on_disc and not process_metadata_anyway:
self.LOGGER.info(f"{song.option_string} already exists, thus not downloading again.")
return r
source = sources[0]
if not found_on_disc:
r = self.download_song_to_target(source=source, target=temp_target, desc=song.title)
if not r.is_fatal_error:
r.merge(self._post_process_targets(song, temp_target,
[] if found_on_disc else self.get_skip_intervals(song, source)))
return r
def _post_process_targets(self, song: Song, temp_target: Target, interval_list: List) -> DownloadResult:
correct_codec(temp_target, interval_list=interval_list)
self.post_process_hook(song, temp_target)
write_metadata_to_target(song.metadata, temp_target, song)
r = DownloadResult()
target: Target
for target in song.target_collection:
if temp_target is not target:
temp_target.copy_content(target)
r.add_target(target)
temp_target.delete()
r.sponsor_segments += len(interval_list)
return r
def get_skip_intervals(self, song: Song, source: Source) -> List[Tuple[float, float]]:
return []

View File

@ -1,22 +1,31 @@
import json
from enum import Enum
from typing import List, Optional, Type
from urllib.parse import urlparse, urlunparse
import pycountry
import json
from enum import Enum
from bs4 import BeautifulSoup
import pycountry
from ..connection import Connection
from ..objects import (Album, Artist, ArtworkCollection, Contact,
DatabaseObject, FormattedText, ID3Timestamp, Label,
Lyrics, Song, Source, SourceType, Target)
from ..utils import dump_to_file
from ..utils.config import logging_settings, main_settings
from ..utils.enums import ALL_SOURCE_TYPES, SourceType
from ..utils.shared import DEBUG
from ..utils.string_processing import clean_song_title
from ..utils.support_classes.download_result import DownloadResult
from ..objects import Source, DatabaseObject
from .abstract import Page
from ..objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
Target,
Contact,
ID3Timestamp,
Lyrics,
FormattedText,
Artwork,
)
from ..connection import Connection
from ..utils.support_classes.download_result import DownloadResult
from ..utils.string_processing import clean_song_title
from ..utils.config import main_settings, logging_settings
from ..utils.shared import DEBUG
if DEBUG:
from ..utils import dump_to_file
@ -39,7 +48,9 @@ class BandcampTypes(Enum):
class Bandcamp(Page):
SOURCE_TYPE = ALL_SOURCE_TYPES.BANDCAMP
# CHANGE
SOURCE_TYPE = SourcePages.BANDCAMP
LOGGER = logging_settings["bandcamp_logger"]
def __init__(self, *args, **kwargs):
self.connection: Connection = Connection(
@ -51,7 +62,8 @@ class Bandcamp(Page):
super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
path = source.parsed_url.path.replace("/", "")
parsed_url = urlparse(source.url)
path = parsed_url.path.replace("/", "")
if path == "" or path.startswith("music"):
return Artist
@ -106,7 +118,7 @@ class Bandcamp(Page):
return Song(
title=clean_song_title(name, artist_name=data["band_name"]),
source_list=source_list,
artist_list=[
main_artist_list=[
Artist(
name=data["band_name"],
source_list=[
@ -124,7 +136,7 @@ class Bandcamp(Page):
"full_page": True,
"search_filter": filter_string,
"search_text": search_query,
}, name=f"search_{filter_string}_{search_query}")
})
if r is None:
return results
@ -173,7 +185,7 @@ class Bandcamp(Page):
if li is None and li['href'] is not None:
continue
source_list.append(Source.match_url(_parse_artist_url(li['href']), referrer_page=self.SOURCE_TYPE))
source_list.append(Source.match_url(_parse_artist_url(li['href']), referer_page=self.SOURCE_TYPE))
return Artist(
name=name,
@ -212,7 +224,7 @@ class Bandcamp(Page):
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
artist = Artist()
r = self.connection.get(_parse_artist_url(source.url), name=f"artist_{urlparse(source.url).scheme}_{urlparse(source.url).netloc}")
r = self.connection.get(_parse_artist_url(source.url))
if r is None:
return artist
@ -226,12 +238,7 @@ class Bandcamp(Page):
html_music_grid = soup.find("ol", {"id": "music-grid"})
if html_music_grid is not None:
for subsoup in html_music_grid.find_all("li"):
artist.album_collection.append(self._parse_album(soup=subsoup, initial_source=source))
# artist artwork
artist_artwork: BeautifulSoup = soup.find("img", {"class":"band-photo"})
if artist_artwork is not None:
artist.artwork.add_data(artist_artwork.get("data-src", artist_artwork.get("src")))
artist.main_album_collection.append(self._parse_album(soup=subsoup, initial_source=source))
for i, data_blob_soup in enumerate(soup.find_all("div", {"id": ["pagedata", "collectors-data"]})):
data_blob = data_blob_soup["data-blob"]
@ -240,14 +247,14 @@ class Bandcamp(Page):
dump_to_file(f"bandcamp_artist_data_blob_{i}.json", data_blob, is_json=True, exit_after_dump=False)
if data_blob is not None:
artist.album_collection.extend(
artist.main_album_collection.extend(
self._parse_artist_data_blob(json.loads(data_blob), source.url)
)
artist.source_collection.append(source)
return artist
def _parse_track_element(self, track: dict, artwork: ArtworkCollection) -> Optional[Song]:
def _parse_track_element(self, track: dict, artwork: Artwork) -> Optional[Song]:
lyrics_list: List[Lyrics] = []
_lyrics: Optional[str] = track.get("item", {}).get("recordingOf", {}).get("lyrics", {}).get("text")
@ -264,7 +271,7 @@ class Bandcamp(Page):
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
album = Album()
r = self.connection.get(source.url, name=f"album_{urlparse(source.url).netloc.split('.')[0]}_{urlparse(source.url).path.replace('/', '').replace('album', '')}")
r = self.connection.get(source.url)
if r is None:
return album
@ -281,15 +288,9 @@ class Bandcamp(Page):
artist_source_list = []
if "@id" in artist_data:
artist_source_list = [Source(self.SOURCE_TYPE, _parse_artist_url(artist_data["@id"]))]
source_list: List[Source] = [source]
if "mainEntityOfPage" in data or "@id" in data:
source_list.append(Source(self.SOURCE_TYPE, data.get("mainEntityOfPage", data["@id"])))
album = Album(
title=data["name"].strip(),
source_list=source_list,
source_list=[Source(self.SOURCE_TYPE, data.get("mainEntityOfPage", data["@id"]))],
date=ID3Timestamp.strptime(data["datePublished"], "%d %b %Y %H:%M:%S %Z"),
artist_list=[Artist(
name=artist_data["name"].strip(),
@ -297,7 +298,7 @@ class Bandcamp(Page):
)]
)
artwork: ArtworkCollection = ArtworkCollection()
artwork: Artwork = Artwork()
def _get_artwork_url(_data: dict) -> Optional[str]:
if "image" in _data:
@ -308,14 +309,15 @@ class Bandcamp(Page):
_artwork_url = _get_artwork_url(data)
if _artwork_url is not None:
artwork.add_data(url=_artwork_url, width=350, height=350)
artwork.append(url=_artwork_url, width=350, height=350)
else:
for album_release in data.get("albumRelease", []):
_artwork_url = _get_artwork_url(album_release)
if _artwork_url is not None:
artwork.add_data(url=_artwork_url, width=350, height=350)
artwork.append(url=_artwork_url, width=350, height=350)
break
for i, track_json in enumerate(data.get("track", {}).get("itemListElement", [])):
if DEBUG:
dump_to_file(f"album_track_{i}.json", json.dumps(track_json), is_json=True, exit_after_dump=False)
@ -336,7 +338,7 @@ class Bandcamp(Page):
return []
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
r = self.connection.get(source.url, name=f"song_{urlparse(source.url).netloc.split('.')[0]}_{urlparse(source.url).path.replace('/', '').replace('track', '')}")
r = self.connection.get(source.url)
if r is None:
return Song()
@ -361,29 +363,17 @@ class Bandcamp(Page):
for key, value in other_data.get("trackinfo", [{}])[0].get("file", {"": None}).items():
mp3_url = value
source_list: List[Source] = [source]
if "mainEntityOfPage" in data or "@id" in data:
source_list.append(Source(self.SOURCE_TYPE, data.get("mainEntityOfPage", data["@id"]), audio_url=mp3_url))
source_list_album: List[Source] = [source]
if "@id" in album_data:
source_list_album.append(Source(self.SOURCE_TYPE, album_data["@id"]))
source_list_artist: List[Source] = [source]
if "@id" in artist_data:
source_list_artist.append(Source(self.SOURCE_TYPE, _parse_artist_url(artist_data["@id"])))
song = Song(
title=clean_song_title(data["name"], artist_name=artist_data["name"]),
source_list=source_list,
source_list=[source, Source(self.SOURCE_TYPE, data.get("mainEntityOfPage", data["@id"]), audio_url=mp3_url)],
album_list=[Album(
title=album_data["name"].strip(),
date=ID3Timestamp.strptime(data["datePublished"], "%d %b %Y %H:%M:%S %Z"),
source_list=source_list_album
source_list=[Source(self.SOURCE_TYPE, album_data["@id"])]
)],
artist_list=[Artist(
main_artist_list=[Artist(
name=artist_data["name"].strip(),
source_list=source_list_artist
source_list=[Source(self.SOURCE_TYPE, _parse_artist_url(artist_data["@id"]))]
)],
lyrics_list=self._fetch_lyrics(soup=soup)
)

View File

@ -7,7 +7,7 @@ from urllib.parse import urlparse, urlencode
from ..connection import Connection
from ..utils.config import logging_settings
from .abstract import Page
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.enums.source import SourcePages
from ..utils.enums.album import AlbumType
from ..utils.support_classes.query import Query
from ..objects import (
@ -52,14 +52,14 @@ def _song_from_json(artist_html=None, album_html=None, release_type=None, title=
return Song(
title=title,
artist_list=[
main_artist_list=[
_artist_from_json(artist_html=artist_html)
],
album_list=[
_album_from_json(album_html=album_html, release_type=release_type, artist_html=artist_html)
],
source_list=[
Source(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, song_id)
Source(SourcePages.ENCYCLOPAEDIA_METALLUM, song_id)
]
)
@ -85,7 +85,7 @@ def _artist_from_json(artist_html=None, genre=None, country=None) -> Artist:
return Artist(
name=artist_name,
source_list=[
Source(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, artist_url)
Source(SourcePages.ENCYCLOPAEDIA_METALLUM, artist_url)
]
)
@ -105,7 +105,7 @@ def _album_from_json(album_html=None, release_type=None, artist_html=None) -> Al
title=album_name,
album_type=album_type,
source_list=[
Source(ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM, album_url)
Source(SourcePages.ENCYCLOPAEDIA_METALLUM, album_url)
],
artist_list=[
_artist_from_json(artist_html=artist_html)
@ -207,7 +207,7 @@ def create_grid(
class EncyclopaediaMetallum(Page):
SOURCE_TYPE = ALL_SOURCE_TYPES.ENCYCLOPAEDIA_METALLUM
SOURCE_TYPE = SourcePages.ENCYCLOPAEDIA_METALLUM
LOGGER = logging_settings["metal_archives_logger"]
def __init__(self, **kwargs):
@ -266,7 +266,7 @@ class EncyclopaediaMetallum(Page):
song_title = song.title.strip()
album_titles = ["*"] if song.album_collection.empty else [album.title.strip() for album in song.album_collection]
artist_titles = ["*"] if song.artist_collection.empty else [artist.name.strip() for artist in song.artist_collection]
artist_titles = ["*"] if song.main_artist_collection.empty else [artist.name.strip() for artist in song.main_artist_collection]
search_results = []
@ -486,7 +486,7 @@ class EncyclopaediaMetallum(Page):
href = anchor["href"]
if href is not None:
source_list.append(Source.match_url(href, referrer_page=self.SOURCE_TYPE))
source_list.append(Source.match_url(href, referer_page=self.SOURCE_TYPE))
# The following code is only legacy code, which I just kep because it doesn't harm.
# The way ma returns sources changed.
@ -504,7 +504,7 @@ class EncyclopaediaMetallum(Page):
if url is None:
continue
source_list.append(Source.match_url(url, referrer_page=self.SOURCE_TYPE))
source_list.append(Source.match_url(url, referer_page=self.SOURCE_TYPE))
return source_list
@ -663,7 +663,7 @@ class EncyclopaediaMetallum(Page):
artist.notes = band_notes
discography: List[Album] = self._fetch_artist_discography(artist_id)
artist.album_collection.extend(discography)
artist.main_album_collection.extend(discography)
return artist
@ -832,7 +832,7 @@ class EncyclopaediaMetallum(Page):
)
def get_source_type(self, source: Source):
if self.SOURCE_TYPE != source.source_type:
if self.SOURCE_TYPE != source.page_enum:
return None
url = source.url

View File

@ -1,288 +0,0 @@
import simplejson as json
from json_unescape import escape_json, unescape_json
from enum import Enum
from typing import List, Optional, Type
from urllib.parse import urlencode, urlparse, urlunparse
import pycountry
from bs4 import BeautifulSoup
from ..connection import Connection
from ..objects import (Album, Artist, ArtworkCollection, Contact,
DatabaseObject, FormattedText, ID3Timestamp, Label,
Lyrics, Song, Source, SourceType, Target)
from ..utils import dump_to_file, traverse_json_path
from ..utils.config import logging_settings, main_settings
from ..utils.enums import ALL_SOURCE_TYPES, SourceType
from ..utils.shared import DEBUG
from ..utils.string_processing import clean_song_title
from ..utils.support_classes.download_result import DownloadResult
from .abstract import Page
if DEBUG:
from ..utils import dump_to_file
class Genius(Page):
SOURCE_TYPE = ALL_SOURCE_TYPES.GENIUS
HOST = "genius.com"
def __init__(self, *args, **kwargs):
self.connection: Connection = Connection(
host="https://genius.com/",
logger=self.LOGGER,
module="genius",
)
super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
path = source.parsed_url.path.replace("/", "")
if path.startswith("artists"):
return Artist
if path.startswith("albums"):
return Album
return Song
def add_to_artwork(self, artwork: ArtworkCollection, url: str):
if url is None:
return
url_frags = url.split(".")
if len(url_frags) < 2:
artwork.add_data(url=url)
return
dimensions = url_frags[-2].split("x")
if len(dimensions) < 2:
artwork.add_data(url=url)
return
if len(dimensions) == 3:
dimensions = dimensions[:-1]
try:
artwork.add_data(url=url, width=int(dimensions[0]), height=int(dimensions[1]))
except ValueError:
artwork.add_data(url=url)
def parse_api_object(self, data: dict) -> Optional[DatabaseObject]:
if data is None:
return None
object_type = data.get("_type")
artwork = ArtworkCollection()
self.add_to_artwork(artwork, data.get("header_image_url"))
self.add_to_artwork(artwork, data.get("image_url"))
additional_sources: List[Source] = []
source: Source = Source(self.SOURCE_TYPE, data.get("url"), additional_data={
"id": data.get("id"),
"slug": data.get("slug"),
"api_path": data.get("api_path"),
})
notes = FormattedText()
description = data.get("description") or {}
if "html" in description:
notes.html = description["html"]
elif "markdown" in description:
notes.markdown = description["markdown"]
elif "description_preview" in data:
notes.plaintext = data["description_preview"]
if source.url is None:
return None
if object_type == "artist":
if data.get("instagram_name") is not None:
additional_sources.append(Source(ALL_SOURCE_TYPES.INSTAGRAM, f"https://www.instagram.com/{data['instagram_name']}/"))
if data.get("facebook_name") is not None:
additional_sources.append(Source(ALL_SOURCE_TYPES.FACEBOOK, f"https://www.facebook.com/{data['facebook_name']}/"))
if data.get("twitter_name") is not None:
additional_sources.append(Source(ALL_SOURCE_TYPES.TWITTER, f"https://x.com/{data['twitter_name']}/"))
return Artist(
name=data["name"].strip() if data.get("name") is not None else None,
source_list=[source],
artwork=artwork,
notes=notes,
)
if object_type == "album":
self.add_to_artwork(artwork, data.get("cover_art_thumbnail_url"))
self.add_to_artwork(artwork, data.get("cover_art_url"))
for cover_art in data.get("cover_arts", []):
self.add_to_artwork(artwork, cover_art.get("image_url"))
self.add_to_artwork(artwork, cover_art.get("thumbnail_image_url"))
return Album(
title=data.get("name").strip(),
source_list=[source],
artist_list=[self.parse_api_object(data.get("artist"))],
artwork=artwork,
date=ID3Timestamp(**(data.get("release_date_components") or {})),
)
if object_type == "song":
self.add_to_artwork(artwork, data.get("song_art_image_thumbnail_url"))
self.add_to_artwork(artwork, data.get("song_art_image_url"))
main_artist_list = []
featured_artist_list = []
_artist_name = None
primary_artist = self.parse_api_object(data.get("primary_artist"))
if primary_artist is not None:
_artist_name = primary_artist.name
main_artist_list.append(primary_artist)
for feature_artist in (*(data.get("featured_artists") or []), *(data.get("producer_artists") or []), *(data.get("writer_artists") or [])):
artist = self.parse_api_object(feature_artist)
if artist is not None:
featured_artist_list.append(artist)
return Song(
title=clean_song_title(data.get("title"), artist_name=_artist_name),
source_list=[source],
artwork=artwork,
feature_artist_list=featured_artist_list,
artist_list=main_artist_list,
)
return None
def general_search(self, search_query: str, **kwargs) -> List[DatabaseObject]:
results = []
search_params = {
"q": search_query,
}
r = self.connection.get("https://genius.com/api/search/multi?" + urlencode(search_params), name=f"search_{search_query}")
if r is None:
return results
dump_to_file("search_genius.json", r.text, is_json=True, exit_after_dump=False)
data = r.json()
for elements in traverse_json_path(data, "response.sections", default=[]):
hits = elements.get("hits", [])
for hit in hits:
parsed = self.parse_api_object(hit.get("result"))
if parsed is not None:
results.append(parsed)
return results
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
artist: Artist = Artist()
# https://genius.com/api/artists/24527/albums?page=1
r = self.connection.get(source.url, name=source.url)
if r is None:
return artist
soup = self.get_soup_from_response(r)
# find the content attribute in the meta tag which is contained in the head
data_container = soup.find("meta", {"itemprop": "page_data"})
if data_container is not None:
content = data_container["content"]
dump_to_file("genius_itemprop_artist.json", content, is_json=True, exit_after_dump=False)
data = json.loads(content)
artist = self.parse_api_object(data.get("artist"))
for e in (data.get("artist_albums") or []):
r = self.parse_api_object(e)
if not isinstance(r, Album):
continue
artist.album_collection.append(r)
for e in (data.get("artist_songs") or []):
r = self.parse_api_object(e)
if not isinstance(r, Song):
continue
"""
TODO
fetch the album for these songs, because the api doesn't
return them
"""
artist.album_collection.extend(r.album_collection)
artist.source_collection.append(source)
return artist
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
album: Album = Album()
# https://genius.com/api/artists/24527/albums?page=1
r = self.connection.get(source.url, name=source.url)
if r is None:
return album
soup = self.get_soup_from_response(r)
# find the content attribute in the meta tag which is contained in the head
data_container = soup.find("meta", {"itemprop": "page_data"})
if data_container is not None:
content = data_container["content"]
dump_to_file("genius_itemprop_album.json", content, is_json=True, exit_after_dump=False)
data = json.loads(content)
album = self.parse_api_object(data.get("album"))
for e in data.get("album_appearances", []):
r = self.parse_api_object(e.get("song"))
if not isinstance(r, Song):
continue
album.song_collection.append(r)
album.source_collection.append(source)
return album
def get_json_content_from_response(self, response, start: str, end: str) -> Optional[str]:
content = response.text
start_index = content.find(start)
if start_index < 0:
return None
start_index += len(start)
end_index = content.find(end, start_index)
if end_index < 0:
return None
return content[start_index:end_index]
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
song: Song = Song()
r = self.connection.get(source.url, name=source.url)
if r is None:
return song
# get the contents that are between `JSON.parse('` and `');`
content = self.get_json_content_from_response(r, start="window.__PRELOADED_STATE__ = JSON.parse('", end="');\n window.__APP_CONFIG__ = ")
if content is not None:
#IMPLEMENT FIX FROM HAZEL
content = escape_json(content)
data = json.loads(content)
lyrics_html = traverse_json_path(data, "songPage.lyricsData.body.html", default=None)
if lyrics_html is not None:
song.lyrics_collection.append(Lyrics(FormattedText(html=lyrics_html)))
dump_to_file("genius_song_script_json.json", content, is_json=True, exit_after_dump=False)
soup = self.get_soup_from_response(r)
for lyrics in soup.find_all("div", {"data-lyrics-container": "true"}):
lyrics_object = Lyrics(FormattedText(html=lyrics.prettify()))
song.lyrics_collection.append(lyrics_object)
song.source_collection.append(source)
return song

View File

@ -1,25 +1,33 @@
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Generator, List, Optional, Type, Union
from typing import List, Optional, Type, Union, Generator
from urllib.parse import urlparse
import pycountry
from bs4 import BeautifulSoup
from ..connection import Connection
from ..objects import (Album, Artist, DatabaseObject,
FormattedText, ID3Timestamp, Label, Lyrics, Song,
Source, Target)
from ..objects.artwork import (Artwork, ArtworkVariant, ArtworkCollection)
from ..utils import shared, string_processing
from ..utils.config import logging_settings, main_settings
from ..utils.enums import ALL_SOURCE_TYPES, SourceType
from ..utils.enums.album import AlbumStatus, AlbumType
from ..utils.string_processing import clean_song_title
from ..utils.support_classes.download_result import DownloadResult
from ..utils.support_classes.query import Query
from .abstract import Page
from ..utils.enums.source import SourcePages
from ..utils.enums.album import AlbumType, AlbumStatus
from ..objects import (
Artist,
Source,
Song,
Album,
ID3Timestamp,
FormattedText,
Label,
Target,
DatabaseObject,
Lyrics,
Artwork
)
from ..utils.config import logging_settings
from ..utils import string_processing, shared
from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
"""
https://musify.club/artist/ghost-bath-280348?_pjax=#bodyContent
@ -102,7 +110,9 @@ def parse_url(url: str) -> MusifyUrl:
class Musify(Page):
SOURCE_TYPE = ALL_SOURCE_TYPES.MUSIFY
# CHANGE
SOURCE_TYPE = SourcePages.MUSIFY
LOGGER = logging_settings["musify_logger"]
HOST = "https://musify.club"
@ -110,7 +120,6 @@ class Musify(Page):
self.connection: Connection = Connection(
host="https://musify.club/",
logger=self.LOGGER,
module="musify",
)
self.stream_connection: Connection = Connection(
@ -346,11 +355,9 @@ class Musify(Page):
if raw_id.isdigit():
_id = raw_id
return Song(
title=clean_song_title(song_title, artist_name=artist_list[0].name if len(artist_list) > 0 else None),
feature_artist_list=artist_list,
title=song_title,
main_artist_list=artist_list,
source_list=source_list
)
@ -365,7 +372,7 @@ class Musify(Page):
def general_search(self, search_query: str) -> List[DatabaseObject]:
search_results = []
r = self.connection.get(f"https://musify.club/search?searchText={search_query}", name="search_" + search_query)
r = self.connection.get(f"https://musify.club/search?searchText={search_query}")
if r is None:
return []
search_soup: BeautifulSoup = self.get_soup_from_response(r)
@ -383,11 +390,10 @@ class Musify(Page):
return search_results
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
musify_url = parse_url(source.url)
r = self.connection.get(source.url, name="track_" + musify_url.name_with_id)
# https://musify.club/track/linkin-park-numb-210765
r = self.connection.get(source.url)
if r is None:
return Song()
return Song
soup = self.get_soup_from_response(r)
@ -407,10 +413,6 @@ class Musify(Page):
href = artist_soup["href"]
if href is not None:
href_parts = href.split("/")
if len(href_parts) <= 1 or href_parts[-2] != "artist":
return
artist_src_list.append(Source(self.SOURCE_TYPE, self.HOST + href))
name_elem: BeautifulSoup = artist_soup.find("span", {"itemprop": "name"})
@ -476,11 +478,11 @@ class Musify(Page):
track_name = list_points[4].text.strip()
# album artwork
artwork: ArtworkCollection = ArtworkCollection()
# artwork
artwork: Artwork = Artwork()
album_image_element_list: List[BeautifulSoup] = soup.find_all("img", {"class": "album-img"})
for album_image_element in album_image_element_list:
artwork.add_data(url=album_image_element.get("data-src", album_image_element.get("src")))
artwork.append(url=album_image_element.get("data-src", album_image_element.get("src")))
# lyrics
lyrics_container: List[BeautifulSoup] = soup.find_all("div", {"id": "tabLyrics"})
@ -493,26 +495,17 @@ class Musify(Page):
for video_container in video_container_list:
iframe_list: List[BeautifulSoup] = video_container.findAll("iframe")
for iframe in iframe_list:
"""
the url could look like this
https://www.youtube.com/embed/sNObCkhzOYA?si=dNVgnZMBNVlNb0P_
"""
parsed_url = urlparse(iframe["src"])
path_parts = parsed_url.path.strip("/").split("/")
if path_parts[0] != "embed" or len(path_parts) < 2:
continue
source_list.append(Source(
ALL_SOURCE_TYPES.YOUTUBE,
f"https://music.youtube.com/watch?v={path_parts[1]}",
referrer_page=self.SOURCE_TYPE
SourcePages.YOUTUBE,
iframe["src"],
referer_page=self.SOURCE_TYPE
))
return Song(
title=clean_song_title(track_name, artist_name=artist_list[0].name if len(artist_list) > 0 else None),
title=track_name,
source_list=source_list,
lyrics_list=lyrics_list,
feature_artist_list=artist_list,
main_artist_list=artist_list,
album_list=album_list,
artwork=artwork,
)
@ -652,113 +645,12 @@ class Musify(Page):
))
return Song(
title=clean_song_title(song_name, artist_name=artist_list[0].name if len(artist_list) > 0 else None),
title=song_name,
tracksort=tracksort,
feature_artist_list=artist_list,
main_artist_list=artist_list,
source_list=source_list
)
def _parse_album(self, soup: BeautifulSoup) -> Album:
name: str = None
source_list: List[Source] = []
artist_list: List[Artist] = []
date: ID3Timestamp = None
"""
if breadcrumb list has 4 elements, then
the -2 is the artist link,
the -1 is the album
"""
# breadcrumb
breadcrumb_soup: BeautifulSoup = soup.find("ol", {"class", "breadcrumb"})
breadcrumb_elements: List[BeautifulSoup] = breadcrumb_soup.find_all("li", {"class": "breadcrumb-item"})
if len(breadcrumb_elements) == 4:
# album
album_crumb: BeautifulSoup = breadcrumb_elements[-1]
name = album_crumb.text.strip()
# artist
artist_crumb: BeautifulSoup = breadcrumb_elements[-2]
anchor: BeautifulSoup = artist_crumb.find("a")
if anchor is not None:
href = anchor.get("href")
href_parts = href.split("/")
if not(len(href_parts) <= 1 or href_parts[-2] != "artist"):
artist_source_list: List[Source] = []
if href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, self.HOST + href.strip()))
span: BeautifulSoup = anchor.find("span")
if span is not None:
artist_list.append(Artist(
name=span.get_text(strip=True),
source_list=artist_source_list
))
else:
self.LOGGER.debug("there are not 4 breadcrumb items, which shouldn't be the case")
# meta
meta_url: BeautifulSoup = soup.find("meta", {"itemprop": "url"})
if meta_url is not None:
url = meta_url.get("content")
if url is not None:
source_list.append(Source(self.SOURCE_TYPE, self.HOST + url))
meta_name: BeautifulSoup = soup.find("meta", {"itemprop": "name"})
if meta_name is not None:
_name = meta_name.get("content")
if _name is not None:
name = _name
# album info
album_info_ul: BeautifulSoup = soup.find("ul", {"class": "album-info"})
if album_info_ul is not None:
artist_anchor: BeautifulSoup
for artist_anchor in album_info_ul.find_all("a", {"itemprop": "byArtist"}):
# line 98
artist_source_list: List[Source] = []
artist_url_meta = artist_anchor.find("meta", {"itemprop": "url"})
if artist_url_meta is not None:
artist_href = artist_url_meta.get("content")
if artist_href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, url=self.HOST + artist_href))
artist_meta_name = artist_anchor.find("meta", {"itemprop": "name"})
if artist_meta_name is not None:
artist_name = artist_meta_name.get("content")
if artist_name is not None:
artist_list.append(Artist(
name=artist_name,
source_list=artist_source_list
))
time_soup: BeautifulSoup = album_info_ul.find("time", {"itemprop": "datePublished"})
if time_soup is not None:
raw_datetime = time_soup.get("datetime")
if raw_datetime is not None:
try:
date = ID3Timestamp.strptime(raw_datetime, "%Y-%m-%d")
except ValueError:
self.LOGGER.debug(f"Raw datetime doesn't match time format %Y-%m-%d: {raw_datetime}")
# album artwork
album_artwork: ArtworkCollection = ArtworkCollection()
album_artwork_list: List[BeautifulSoup] = soup.find_all("img", {"class":"artist-img"})
for album_artwork in album_artwork_list:
album_artwork.add_data(url=album_artwork.get("data-src", album_artwork.get("src")))
return Album(
title=name,
source_list=source_list,
artist_list=artist_list,
date=date,
artwork=album_artwork
)
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
"""
fetches album from source:
@ -777,7 +669,7 @@ class Musify(Page):
url = parse_url(source.url)
endpoint = self.HOST + "/release/" + url.name_with_id
r = self.connection.get(endpoint, name=url.name_with_id)
r = self.connection.get(endpoint)
if r is None:
return Album()
@ -793,20 +685,30 @@ class Musify(Page):
new_song = self._parse_song_card(card_soup)
album.song_collection.append(new_song)
if stop_at_level > 1:
song: Song
for song in album.song_collection:
sources = song.source_collection.get_sources_from_page(self.SOURCE_TYPE)
for source in sources:
song.merge(self.fetch_song(source=source))
album.update_tracksort()
return album
def _fetch_initial_artist(self, url: MusifyUrl, source: Source, **kwargs) -> Artist:
def _get_artist_attributes(self, url: MusifyUrl) -> Artist:
"""
fetches the main Artist attributes from this endpoint
https://musify.club/artist/ghost-bath-280348?_pjax=#bodyContent
it needs to parse html
:param url:
:return:
"""
r = self.connection.get(f"https://musify.club/{url.source_type.value}/{url.name_with_id}?_pjax=#bodyContent", name="artist_attributes_" + url.name_with_id)
r = self.connection.get(f"https://musify.club/{url.source_type.value}/{url.name_with_id}?_pjax=#bodyContent")
if r is None:
return Artist(source_list=[source])
return Artist()
soup = self.get_soup_from_response(r)
@ -905,7 +807,7 @@ class Musify(Page):
href = additional_source.get("href")
if href is None:
continue
new_src = Source.match_url(href, referrer_page=self.SOURCE_TYPE)
new_src = Source.match_url(href, referer_page=self.SOURCE_TYPE)
if new_src is None:
continue
source_list.append(new_src)
@ -914,21 +816,14 @@ class Musify(Page):
if note_soup is not None:
notes.html = note_soup.decode_contents()
# get artist profile artwork
main_artist_artwork: ArtworkCollection = ArtworkCollection()
artist_image_element_list: List[BeautifulSoup] = soup.find_all("img", {"class":"artist-img"})
for artist_image_element in artist_image_element_list:
main_artist_artwork.add_data(url=artist_image_element.get("data-src", artist_image_element.get("src")))
return Artist(
name=name,
country=country,
source_list=source_list,
notes=notes,
artwork=main_artist_artwork
notes=notes
)
def _parse_album_card(self, album_card: BeautifulSoup, artist_name: str = None, **kwargs) -> Album:
def _parse_album_card(self, album_card: BeautifulSoup, artist_name: str = None) -> Album:
"""
<div class="card release-thumbnail" data-type="2">
<a href="/release/ghost-bath-self-loather-2021-1554266">
@ -952,20 +847,46 @@ class Musify(Page):
</div>
"""
album_kwargs: Dict[str, Any] = {
"source_list": [],
}
_id: Optional[str] = None
name: str = None
source_list: List[Source] = []
timestamp: Optional[ID3Timestamp] = None
album_status = None
def set_name(new_name: str):
nonlocal name
nonlocal artist_name
# example of just setting not working:
# https://musify.club/release/unjoy-eurythmie-psychonaut-4-tired-numb-still-alive-2012-324067
if new_name.count(" - ") != 1:
name = new_name
return
potential_artist_list, potential_name = new_name.split(" - ")
unified_artist_list = string_processing.unify(potential_artist_list)
if artist_name is not None:
if string_processing.unify(artist_name) not in unified_artist_list:
name = new_name
return
name = potential_name
return
name = new_name
album_status_id = album_card.get("data-type")
if album_status_id.isdigit():
album_status_id = int(album_status_id)
album_kwargs["album_type"] = ALBUM_TYPE_MAP[album_status_id]
album_type = ALBUM_TYPE_MAP[album_status_id]
if album_status_id == 5:
album_kwargs["album_status"] = AlbumStatus.BOOTLEG
album_status = AlbumStatus.BOOTLEG
def parse_release_anchor(_anchor: BeautifulSoup, text_is_name=False):
nonlocal album_kwargs
nonlocal _id
nonlocal name
nonlocal source_list
if _anchor is None:
return
@ -973,13 +894,20 @@ class Musify(Page):
href = _anchor.get("href")
if href is not None:
# add url to sources
album_kwargs["source_list"].append(Source(
source_list.append(Source(
self.SOURCE_TYPE,
self.HOST + href
))
if text_is_name:
album_kwargs["title"] = clean_song_title(_anchor.text, artist_name)
# split id from url
split_href = href.split("-")
if len(split_href) > 1:
_id = split_href[-1]
if not text_is_name:
return
set_name(_anchor.text)
anchor_list = album_card.find_all("a", recursive=False)
if len(anchor_list) > 0:
@ -990,7 +918,7 @@ class Musify(Page):
if thumbnail is not None:
alt = thumbnail.get("alt")
if alt is not None:
album_kwargs["title"] = clean_song_title(alt, artist_name)
set_name(alt)
image_url = thumbnail.get("src")
else:
@ -1007,7 +935,7 @@ class Musify(Page):
13.11.2021
</small>
"""
nonlocal album_kwargs
nonlocal timestamp
italic_tagging_soup: BeautifulSoup = small_soup.find("i")
if italic_tagging_soup is None:
@ -1017,7 +945,7 @@ class Musify(Page):
return
raw_time = small_soup.text.strip()
album_kwargs["date"] = ID3Timestamp.strptime(raw_time, "%d.%m.%Y")
timestamp = ID3Timestamp.strptime(raw_time, "%d.%m.%Y")
# parse small date
card_footer_list = album_card.find_all("div", {"class": "card-footer"})
@ -1030,9 +958,105 @@ class Musify(Page):
else:
self.LOGGER.debug("there is not even 1 footer in the album card")
return Album(**album_kwargs)
return Album(
title=name,
source_list=source_list,
date=timestamp,
album_type=album_type,
album_status=album_status
)
def _fetch_artist_discography(self, artist: Artist, url: MusifyUrl, artist_name: str = None, **kwargs):
def _parse_album(self, soup: BeautifulSoup) -> Album:
name: str = None
source_list: List[Source] = []
artist_list: List[Artist] = []
date: ID3Timestamp = None
"""
if breadcrumb list has 4 elements, then
the -2 is the artist link,
the -1 is the album
"""
# breadcrumb
breadcrumb_soup: BeautifulSoup = soup.find("ol", {"class", "breadcrumb"})
breadcrumb_elements: List[BeautifulSoup] = breadcrumb_soup.find_all("li", {"class": "breadcrumb-item"})
if len(breadcrumb_elements) == 4:
# album
album_crumb: BeautifulSoup = breadcrumb_elements[-1]
name = album_crumb.text.strip()
# artist
artist_crumb: BeautifulSoup = breadcrumb_elements[-2]
anchor: BeautifulSoup = artist_crumb.find("a")
if anchor is not None:
href = anchor.get("href")
artist_source_list: List[Source] = []
if href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, self.HOST + href.strip()))
span: BeautifulSoup = anchor.find("span")
if span is not None:
artist_list.append(Artist(
name=span.get_text(strip=True),
source_list=artist_source_list
))
else:
self.LOGGER.debug("there are not 4 breadcrumb items, which shouldn't be the case")
# meta
meta_url: BeautifulSoup = soup.find("meta", {"itemprop": "url"})
if meta_url is not None:
url = meta_url.get("content")
if url is not None:
source_list.append(Source(self.SOURCE_TYPE, self.HOST + url))
meta_name: BeautifulSoup = soup.find("meta", {"itemprop": "name"})
if meta_name is not None:
_name = meta_name.get("content")
if _name is not None:
name = _name
# album info
album_info_ul: BeautifulSoup = soup.find("ul", {"class": "album-info"})
if album_info_ul is not None:
artist_anchor: BeautifulSoup
for artist_anchor in album_info_ul.find_all("a", {"itemprop": "byArtist"}):
# line 98
artist_source_list: List[Source] = []
artist_url_meta = artist_anchor.find("meta", {"itemprop": "url"})
if artist_url_meta is not None:
artist_href = artist_url_meta.get("content")
if artist_href is not None:
artist_source_list.append(Source(self.SOURCE_TYPE, url=self.HOST + artist_href))
artist_meta_name = artist_anchor.find("meta", {"itemprop": "name"})
if artist_meta_name is not None:
artist_name = artist_meta_name.get("content")
if artist_name is not None:
artist_list.append(Artist(
name=artist_name,
source_list=artist_source_list
))
time_soup: BeautifulSoup = album_info_ul.find("time", {"itemprop": "datePublished"})
if time_soup is not None:
raw_datetime = time_soup.get("datetime")
if raw_datetime is not None:
try:
date = ID3Timestamp.strptime(raw_datetime, "%Y-%m-%d")
except ValueError:
self.LOGGER.debug(f"Raw datetime doesn't match time format %Y-%m-%d: {raw_datetime}")
return Album(
title=name,
source_list=source_list,
artist_list=artist_list,
date=date
)
def _get_discography(self, url: MusifyUrl, artist_name: str = None, stop_at_level: int = 1) -> Generator[Album, None, None]:
"""
POST https://musify.club/artist/filteralbums
ArtistID: 280348
@ -1040,8 +1064,6 @@ class Musify(Page):
SortOrder.IsAscending: false
X-Requested-With: XMLHttpRequest
"""
_download_all = kwargs.get("download_all", False)
_album_type_blacklist = kwargs.get("album_type_blacklist", main_settings["album_type_blacklist"])
endpoint = self.HOST + "/" + url.source_type.value + "/filteralbums"
@ -1050,42 +1072,36 @@ class Musify(Page):
"SortOrder.Property": "dateCreated",
"SortOrder.IsAscending": False,
"X-Requested-With": "XMLHttpRequest"
}, name="discography_" + url.name_with_id)
})
if r is None:
return
soup: BeautifulSoup = self.get_soup_from_response(r)
return []
soup: BeautifulSoup = BeautifulSoup(r.content, features="html.parser")
for card_soup in soup.find_all("div", {"class": "card"}):
album = self._parse_album_card(card_soup, artist_name, **kwargs)
if not self.fetch_options.download_all and album.album_type in self.fetch_options.album_type_blacklist:
continue
yield self._parse_album_card(card_soup, artist_name)
artist.album_collection.append(album)
def _fetch_artist_artwork(self, source: str, artist: Artist, **kwargs):
# artist artwork
artwork_gallery = self.get_soup_from_response(self.connection.get(source.strip().strip("/") + "/photos"))
if artwork_gallery is not None:
gallery_body_content: BeautifulSoup = artwork_gallery.find(id="bodyContent")
gallery_image_element_list: List[BeautifulSoup] = gallery_body_content.find_all("img")
for gallery_image_element in gallery_image_element_list:
artist.artwork.append(ArtworkVariant(url=gallery_image_element.get("data-src", gallery_image_element.get("src")), width=247, heigth=247))
def fetch_artist(self, source: Source, **kwargs) -> Artist:
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
"""
TODO
fetches artist from source
[x] discography
[x] attributes
[x] picture gallery
[] picture gallery
Args:
source (Source): the source to fetch
stop_at_level: int = 1: if it is false, every album from discograohy will be fetched. Defaults to False.
Returns:
Artist: the artist fetched
"""
url = parse_url(source.url)
artist = self._fetch_initial_artist(url, source=source, **kwargs)
self._fetch_artist_discography(artist, url, artist.name, **kwargs)
self._fetch_artist_artwork(url.url, artist, **kwargs)
artist = self._get_artist_attributes(url)
artist.main_album_collection.extend(self._get_discography(url, artist.name))
return artist
def fetch_label(self, source: Source, stop_at_level: int = 1) -> Label:
@ -1107,4 +1123,4 @@ class Musify(Page):
self.LOGGER.warning(f"The source has no audio link. Falling back to {endpoint}.")
return self.stream_connection.stream_into(endpoint, target, raw_url=True, exclude_headers=["Host"], name=desc)
return self.stream_connection.stream_into(endpoint, target, raw_url=True, exclude_headers=["Host"])

View File

@ -0,0 +1,65 @@
from typing import List, Optional, Type
from urllib.parse import urlparse
import logging
from ..objects import Source, DatabaseObject
from .abstract import Page
from ..objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
Target
)
from ..connection import Connection
from ..utils.support_classes.query import Query
from ..utils.support_classes.download_result import DownloadResult
class Preset(Page):
# CHANGE
SOURCE_TYPE = SourcePages.PRESET
LOGGER = logging.getLogger("preset")
def __init__(self, *args, **kwargs):
self.connection: Connection = Connection(
host="https://www.preset.cum/",
logger=self.LOGGER
)
super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
return super().get_source_type(source)
def general_search(self, search_query: str) -> List[DatabaseObject]:
return []
def label_search(self, label: Label) -> List[Label]:
return []
def artist_search(self, artist: Artist) -> List[Artist]:
return []
def album_search(self, album: Album) -> List[Album]:
return []
def song_search(self, song: Song) -> List[Song]:
return []
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
return Song()
def fetch_album(self, source: Source, stop_at_level: int = 1) -> Album:
return Album()
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
return Artist()
def fetch_label(self, source: Source, stop_at_level: int = 1) -> Label:
return Label()
def download_song_to_target(self, source: Source, target: Target, desc: str = None) -> DownloadResult:
return DownloadResult()

View File

@ -2,13 +2,15 @@ from typing import List, Optional, Type, Tuple
from urllib.parse import urlparse, urlunparse, parse_qs
from enum import Enum
import python_sponsorblock
import sponsorblock
from sponsorblock.errors import HTTPException, NotFoundException
from ..objects import Source, DatabaseObject, Song, Target
from .abstract import Page
from ..objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
@ -18,7 +20,6 @@ from ..objects import (
)
from ..connection import Connection
from ..utils.string_processing import clean_song_title
from ..utils.enums import SourceType, ALL_SOURCE_TYPES
from ..utils.support_classes.download_result import DownloadResult
from ..utils.config import youtube_settings, main_settings, logging_settings
@ -39,7 +40,10 @@ def get_piped_url(path: str = "", params: str = "", query: str = "", fragment: s
class YouTube(SuperYouTube):
# CHANGE
SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
SOURCE_TYPE = SourcePages.YOUTUBE
LOGGER = logging_settings["youtube_logger"]
NO_ADDITIONAL_DATA_FROM_SONG = True
def __init__(self, *args, **kwargs):
self.connection: Connection = Connection(
@ -59,9 +63,8 @@ class YouTube(SuperYouTube):
)
# the stuff with the connection is, to ensure sponsorblock uses the proxies, my programm does
_sponsorblock_connection: Connection = Connection()
self.sponsorblock = python_sponsorblock.SponsorBlock(silent=True, session=_sponsorblock_connection.session)
_sponsorblock_connection: Connection = Connection(host="https://sponsor.ajay.app/")
self.sponsorblock_client = sponsorblock.Client(session=_sponsorblock_connection.session)
super().__init__(*args, **kwargs)
@ -143,7 +146,7 @@ class YouTube(SuperYouTube):
self.SOURCE_TYPE, get_invidious_url(path="/watch", query=f"v={data['videoId']}")
)],
notes=FormattedText(html=data["descriptionHtml"] + f"\n<p>{license_str}</ p>" ),
artist_list=artist_list
main_artist_list=artist_list
), int(data["published"])
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
@ -284,7 +287,7 @@ class YouTube(SuperYouTube):
self.LOGGER.warning(f"didn't found any playlists with piped, falling back to invidious. (it is unusual)")
album_list, artist_name = self.fetch_invidious_album_list(parsed.id)
return Artist(name=artist_name, album_list=album_list, source_list=[source])
return Artist(name=artist_name, main_album_list=album_list, source_list=[source])
def download_song_to_target(self, source: Source, target: Target, desc: str = None) -> DownloadResult:
"""
@ -341,10 +344,10 @@ class YouTube(SuperYouTube):
segments = []
try:
segments = self.sponsorblock.get_segments(parsed.id)
segments = self.sponsorblock_client.get_skip_segments(parsed.id)
except NotFoundException:
self.LOGGER.debug(f"No sponsor found for the video {parsed.id}.")
except HTTPException as e:
self.LOGGER.warning(f"{e}")
return [(segment.segment[0], segment.segment[1]) for segment in segments]
return [(segment.start, segment.end) for segment in segments]

View File

@ -7,6 +7,7 @@ from ..abstract import Page
from ...objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
@ -24,6 +25,7 @@ def music_card_shelf_renderer(renderer: dict) -> List[DatabaseObject]:
results.extend(parse_renderer(sub_renderer))
return results
def music_responsive_list_item_flex_column_renderer(renderer: dict) -> List[DatabaseObject]:
return parse_run_list(renderer.get("text", {}).get("runs", []))
@ -52,24 +54,19 @@ def music_responsive_list_item_renderer(renderer: dict) -> List[DatabaseObject]:
for result in results:
_map[type(result)].append(result)
if len(song_list) == 1:
song = song_list[0]
song.feature_artist_collection.extend(artist_list)
for song in song_list:
song.album_collection.extend(album_list)
return [song]
song.main_artist_collection.extend(artist_list)
if len(album_list) == 1:
album = album_list[0]
for album in album_list:
album.artist_collection.extend(artist_list)
album.song_collection.extend(song_list)
return [album]
"""
if len(artist_list) == 1:
artist = artist_list[0]
artist.main_album_collection.extend(album_list)
return [artist]
"""
if len(song_list) > 0:
return song_list
if len(album_list) > 0:
return album_list
if len(artist_list) > 0:
return artist_list
return results

View File

@ -2,14 +2,12 @@ from typing import List, Optional
from enum import Enum
from ...utils.config import youtube_settings, logging_settings
from ...utils.string_processing import clean_song_title
from ...utils.enums import SourceType, ALL_SOURCE_TYPES
from ...objects import Source, DatabaseObject
from ..abstract import Page
from ...objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
@ -19,7 +17,7 @@ from ...objects import (
LOGGER = logging_settings["youtube_music_logger"]
SOURCE_PAGE = ALL_SOURCE_TYPES.YOUTUBE
SOURCE_PAGE = SourcePages.YOUTUBE_MUSIC
class PageType(Enum):
@ -41,7 +39,7 @@ def parse_run_element(run_element: dict) -> Optional[DatabaseObject]:
_temp_nav = run_element.get("navigationEndpoint", {})
is_video = "watchEndpoint" in _temp_nav
navigation_endpoint = _temp_nav.get("watchEndpoint", _temp_nav.get("browseEndpoint", {}))
navigation_endpoint = _temp_nav.get("watchEndpoint" if is_video else "browseEndpoint", {})
element_type = PageType.SONG
page_type_string = navigation_endpoint.get("watchEndpointMusicSupportedConfigs", {}).get("watchEndpointMusicConfig", {}).get("musicVideoType", "")
@ -52,7 +50,7 @@ def parse_run_element(run_element: dict) -> Optional[DatabaseObject]:
except ValueError:
return
element_id = navigation_endpoint.get("videoId", navigation_endpoint.get("browseId"))
element_id = navigation_endpoint.get("videoId" if is_video else "browseId")
element_text = run_element.get("text")
if element_id is None or element_text is None:
@ -61,11 +59,7 @@ def parse_run_element(run_element: dict) -> Optional[DatabaseObject]:
if element_type == PageType.SONG or (element_type == PageType.VIDEO and not youtube_settings["youtube_music_clean_data"]) or (element_type == PageType.OFFICIAL_MUSIC_VIDEO and not youtube_settings["youtube_music_clean_data"]):
source = Source(SOURCE_PAGE, f"https://music.youtube.com/watch?v={element_id}")
return Song(
title=clean_song_title(element_text),
source_list=[source]
)
return Song(title=element_text, source_list=[source])
if element_type == PageType.ARTIST or (element_type == PageType.CHANNEL and not youtube_settings["youtube_music_clean_data"]):
source = Source(SOURCE_PAGE, f"https://music.youtube.com/channel/{element_id}")

View File

@ -3,13 +3,15 @@ from urllib.parse import urlparse, urlunparse, parse_qs
from enum import Enum
import requests
import python_sponsorblock
import sponsorblock
from sponsorblock.errors import HTTPException, NotFoundException
from ...objects import Source, DatabaseObject, Song, Target
from ..abstract import Page
from ...objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
@ -20,7 +22,6 @@ from ...objects import (
from ...connection import Connection
from ...utils.support_classes.download_result import DownloadResult
from ...utils.config import youtube_settings, logging_settings, main_settings
from ...utils.enums import SourceType, ALL_SOURCE_TYPES
def get_invidious_url(path: str = "", params: str = "", query: str = "", fragment: str = "") -> str:
@ -50,7 +51,7 @@ class YouTubeUrl:
"""
def __init__(self, url: str) -> None:
self.SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
self.SOURCE_TYPE = SourcePages.YOUTUBE
"""
Raises Index exception for wrong url, and value error for not found enum type
@ -58,6 +59,9 @@ class YouTubeUrl:
self.id = ""
parsed = urlparse(url=url)
if parsed.netloc == "music.youtube.com":
self.SOURCE_TYPE = SourcePages.YOUTUBE_MUSIC
self.url_type: YouTubeUrlType
type_frag_list = parsed.path.split("/")
@ -121,7 +125,8 @@ class YouTubeUrl:
class SuperYouTube(Page):
# CHANGE
SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
SOURCE_TYPE = SourcePages.YOUTUBE
LOGGER = logging_settings["youtube_logger"]
NO_ADDITIONAL_DATA_FROM_SONG = False
@ -138,10 +143,9 @@ class SuperYouTube(Page):
)
# the stuff with the connection is, to ensure sponsorblock uses the proxies, my programm does
_sponsorblock_connection: Connection = Connection()
self.sponsorblock = python_sponsorblock.SponsorBlock(silent=True, session=_sponsorblock_connection.session)
_sponsorblock_connection: Connection = Connection(host="https://sponsor.ajay.app/")
self.sponsorblock_client = sponsorblock.Client(session=_sponsorblock_connection.session)
super().__init__(*args, **kwargs)
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
_url_type = {
@ -209,10 +213,10 @@ class SuperYouTube(Page):
segments = []
try:
segments = self.sponsorblock.get_segments(parsed.id)
segments = self.sponsorblock_client.get_skip_segments(parsed.id)
except NotFoundException:
self.LOGGER.debug(f"No sponsor found for the video {parsed.id}.")
except HTTPException as e:
self.LOGGER.warning(f"{e}")
return [(segment.segment[0], segment.segment[1]) for segment in segments]
return [(segment.start, segment.end) for segment in segments]

View File

@ -1,35 +1,41 @@
from __future__ import annotations, unicode_literals
from __future__ import unicode_literals, annotations
import json
from typing import Dict, List, Optional, Set, Type
from urllib.parse import urlparse, urlunparse, quote, parse_qs, urlencode
import logging
import random
import re
from collections import defaultdict
import json
from dataclasses import dataclass
import re
from functools import lru_cache
from typing import Dict, List, Optional, Set, Type
from urllib.parse import parse_qs, quote, urlencode, urlparse, urlunparse
import youtube_dl
from youtube_dl.extractor.youtube import YoutubeIE
from youtube_dl.utils import DownloadError
from ...connection import Connection
from ...objects import Album, Artist, ArtworkCollection
from ...objects import DatabaseObject as DataObject
from ...objects import (FormattedText, ID3Timestamp, Label, Lyrics, Song,
Source, Target)
from ...utils import dump_to_file, get_current_millis, traverse_json_path
from ...utils.config import logging_settings, main_settings, youtube_settings
from ...utils.enums import ALL_SOURCE_TYPES, SourceType
from ...utils.enums.album import AlbumType
from ...utils.exception.config import SettingValueError
from ...utils.config import main_settings, youtube_settings, logging_settings
from ...utils.shared import DEBUG, DEBUG_YOUTUBE_INITIALIZING
from ...utils.string_processing import clean_song_title
from ...utils.support_classes.download_result import DownloadResult
from ...utils import get_current_millis
from ...utils import dump_to_file
from ...objects import Source, DatabaseObject, ID3Timestamp, Artwork
from ..abstract import Page
from ...objects import (
Artist,
Source,
SourcePages,
Song,
Album,
Label,
Target
)
from ...connection import Connection
from ...utils.support_classes.download_result import DownloadResult
from ._list_render import parse_renderer
from ._music_object_render import parse_run_element
from .super_youtube import SuperYouTube
@ -156,21 +162,16 @@ class MusicKrakenYoutubeIE(YoutubeIE):
ALBUM_TYPE_MAP = {
"Single": AlbumType.SINGLE,
"Album": AlbumType.STUDIO_ALBUM,
"EP": AlbumType.EP,
}
class YoutubeMusic(SuperYouTube):
# CHANGE
SOURCE_TYPE = ALL_SOURCE_TYPES.YOUTUBE
SOURCE_TYPE = SourcePages.YOUTUBE_MUSIC
LOGGER = logging_settings["youtube_music_logger"]
def __init__(self, *args, ydl_opts: dict = None, **kwargs):
self.yt_music_connection: YoutubeMusicConnection = YoutubeMusicConnection(
logger=self.LOGGER,
accept_language="en-US,en;q=0.5",
accept_language="en-US,en;q=0.5"
)
self.credentials: YouTubeMusicCredentials = YouTubeMusicCredentials(
api_key=youtube_settings["youtube_music_api_key"],
@ -181,6 +182,7 @@ class YoutubeMusic(SuperYouTube):
self.start_millis = get_current_millis()
if self.credentials.api_key == "" or DEBUG_YOUTUBE_INITIALIZING:
self._fetch_from_main_page()
SuperYouTube.__init__(self, *args, **kwargs)
@ -202,8 +204,6 @@ class YoutubeMusic(SuperYouTube):
self.download_values_by_url: dict = {}
self.not_download: Dict[str, DownloadError] = {}
super().__init__(*args, **kwargs)
def _fetch_from_main_page(self):
"""
===API=KEY===
@ -212,7 +212,7 @@ class YoutubeMusic(SuperYouTube):
search for: "innertubeApiKey"
"""
r = self.yt_music_connection.get("https://music.youtube.com/", name="youtube_music_index.html", disable_cache=True, enable_cache_readonly=True)
r = self.yt_music_connection.get("https://music.youtube.com/")
if r is None:
return
@ -232,7 +232,7 @@ class YoutubeMusic(SuperYouTube):
'set_ytc': 'true',
'set_apyt': 'true',
'set_eom': 'false'
}, disable_cache=True)
})
if r is None:
return
@ -247,9 +247,9 @@ class YoutubeMusic(SuperYouTube):
# save cookies in settings
youtube_settings["youtube_music_consent_cookies"] = cookie_dict
else:
self.yt_music_connection.save(r, "youtube_music_index.html", no_update_if_valid_exists=True)
self.yt_music_connection.save(r, "index.html")
r = self.yt_music_connection.get("https://music.youtube.com/", name="youtube_music_index.html")
r = self.yt_music_connection.get("https://music.youtube.com/", name="index.html")
if r is None:
return
@ -336,10 +336,10 @@ class YoutubeMusic(SuperYouTube):
default='{}'
)) or {}
def get_source_type(self, source: Source) -> Optional[Type[DataObject]]:
def get_source_type(self, source: Source) -> Optional[Type[DatabaseObject]]:
return super().get_source_type(source)
def general_search(self, search_query: str) -> List[DataObject]:
def general_search(self, search_query: str) -> List[DatabaseObject]:
search_query = search_query.strip()
urlescaped_query: str = quote(search_query.strip().replace(" ", "+"))
@ -374,8 +374,7 @@ class YoutubeMusic(SuperYouTube):
},
headers={
"Referer": get_youtube_url(path=f"/search", query=f"q={urlescaped_query}")
},
name=f"search_{search_query}.json"
}
)
if r is None:
@ -401,7 +400,7 @@ class YoutubeMusic(SuperYouTube):
return results
def fetch_artist(self, source: Source, stop_at_level: int = 1) -> Artist:
artist = Artist(source_list=[source])
artist = Artist()
# construct the request
url = urlparse(source.url)
@ -412,8 +411,7 @@ class YoutubeMusic(SuperYouTube):
json={
"browseId": browse_id,
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}}
},
name=f"fetch_artist_{browse_id}.json"
}
)
if r is None:
return artist
@ -421,28 +419,9 @@ class YoutubeMusic(SuperYouTube):
if DEBUG:
dump_to_file(f"{browse_id}.json", r.text, is_json=True, exit_after_dump=False)
# artist details
data: dict = r.json()
header = data.get("header", {})
musicDetailHeaderRenderer = header.get("musicDetailHeaderRenderer", {})
musicImmersiveHeaderRenderer = header.get("musicImmersiveHeaderRenderer", {})
title_runs: List[dict] = musicDetailHeaderRenderer.get("title", {}).get("runs", [])
subtitle_runs: List[dict] = musicDetailHeaderRenderer.get("subtitle", {}).get("runs", [])
if len(title_runs) > 0:
artist.name = title_runs[0].get("text", artist.name)
# fetch discography
renderer_list = r.json().get("contents", {}).get("singleColumnBrowseResultsRenderer", {}).get("tabs", [{}])[
0].get("tabRenderer", {}).get("content", {}).get("sectionListRenderer", {}).get("contents", [])
# fetch artist artwork
artist_thumbnails = musicImmersiveHeaderRenderer.get("thumbnail", {}).get("musicThumbnailRenderer", {}).get("thumbnail", {}).get("thumbnails", {})
for artist_thumbnail in artist_thumbnails:
artist.artwork.append(artist_thumbnail)
if DEBUG:
for i, content in enumerate(renderer_list):
dump_to_file(f"{i}-artists-renderer.json", json.dumps(content), is_json=True, exit_after_dump=False)
@ -475,8 +454,7 @@ class YoutubeMusic(SuperYouTube):
json={
"browseId": browse_id,
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}}
},
name=f"fetch_album_{browse_id}.json"
}
)
if r is None:
return album
@ -484,51 +462,6 @@ class YoutubeMusic(SuperYouTube):
if DEBUG:
dump_to_file(f"{browse_id}.json", r.text, is_json=True, exit_after_dump=False)
data = r.json()
# album details
header = data.get("header", {})
musicDetailHeaderRenderer = header.get("musicDetailHeaderRenderer", {})
# album artwork
album_thumbnails = musicDetailHeaderRenderer.get("thumbnail", {}).get("croppedSquareThumbnailRenderer", {}).get("thumbnail", {}).get("thumbnails", {})
for album_thumbnail in album_thumbnails:
album.artwork.append(value=album_thumbnail)
title_runs: List[dict] = musicDetailHeaderRenderer.get("title", {}).get("runs", [])
subtitle_runs: List[dict] = musicDetailHeaderRenderer.get("subtitle", {}).get("runs", [])
if len(title_runs) > 0:
album.title = title_runs[0].get("text", album.title)
def other_parse_run(run: dict) -> str:
nonlocal album
if "text" not in run:
return
text = run["text"]
is_text_field = len(run.keys()) == 1
# regex that text is a year
if is_text_field and re.match(r"\d{4}", text):
album.date = ID3Timestamp.strptime(text, "%Y")
return
if text in ALBUM_TYPE_MAP:
album.album_type = ALBUM_TYPE_MAP[text]
return
if not is_text_field:
r = parse_run_element(run)
if r is not None:
album.add_list_of_other_objects([r])
return
for _run in subtitle_runs:
other_parse_run(_run)
# tracklist
renderer_list = r.json().get("contents", {}).get("singleColumnBrowseResultsRenderer", {}).get("tabs", [{}])[
0].get("tabRenderer", {}).get("content", {}).get("sectionListRenderer", {}).get("contents", [])
@ -536,75 +469,20 @@ class YoutubeMusic(SuperYouTube):
for i, content in enumerate(renderer_list):
dump_to_file(f"{i}-album-renderer.json", json.dumps(content), is_json=True, exit_after_dump=False)
results = []
"""
cant use fixed indices, because if something has no entries, the list dissappears
instead I have to try parse everything, and just reject community playlists and profiles.
"""
for renderer in renderer_list:
album.add_list_of_other_objects(parse_renderer(renderer))
results.extend(parse_renderer(renderer))
for song in album.song_collection:
for song_source in song.source_collection:
song_source.additional_data["playlist_id"] = browse_id
album.add_list_of_other_objects(results)
return album
def fetch_lyrics(self, video_id: str, playlist_id: str = None) -> str:
"""
1. fetches the tabs of a song, to get the browse id
2. finds the browse id of the lyrics
3. fetches the lyrics with the browse id
"""
request_data = {
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}},
"videoId": video_id,
}
if playlist_id is not None:
request_data["playlistId"] = playlist_id
tab_request = self.yt_music_connection.post(
url=get_youtube_url(path="/youtubei/v1/next", query=f"prettyPrint=false"),
json=request_data,
name=f"fetch_song_tabs_{video_id}.json",
)
if tab_request is None:
return None
dump_to_file(f"fetch_song_tabs_{video_id}.json", tab_request.text, is_json=True, exit_after_dump=False)
tab_data: dict = tab_request.json()
tabs = traverse_json_path(tab_data, "contents.singleColumnMusicWatchNextResultsRenderer.tabbedRenderer.watchNextTabbedResultsRenderer.tabs", default=[])
browse_id = None
for tab in tabs:
pageType = traverse_json_path(tab, "tabRenderer.endpoint.browseEndpoint.browseEndpointContextSupportedConfigs.browseEndpointContextMusicConfig.pageType", default="")
if pageType in ("MUSIC_TAB_TYPE_LYRICS", "MUSIC_PAGE_TYPE_TRACK_LYRICS") or "lyrics" in pageType.lower():
browse_id = traverse_json_path(tab, "tabRenderer.endpoint.browseEndpoint.browseId", default=None)
if browse_id is not None:
break
if browse_id is None:
return None
r = self.yt_music_connection.post(
url=get_youtube_url(path="/youtubei/v1/browse", query=f"prettyPrint=false"),
json={
"browseId": browse_id,
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}}
},
name=f"fetch_song_lyrics_{video_id}.json"
)
if r is None:
return None
dump_to_file(f"fetch_song_lyrics_{video_id}.json", r.text, is_json=True, exit_after_dump=False)
data = r.json()
lyrics_text = traverse_json_path(data, "contents.sectionListRenderer.contents[0].musicDescriptionShelfRenderer.description.runs[0].text", default=None)
if lyrics_text is None:
return None
return Lyrics(FormattedText(plain=lyrics_text))
def fetch_song(self, source: Source, stop_at_level: int = 1) -> Song:
ydl_res: dict = {}
@ -617,19 +495,7 @@ class YoutubeMusic(SuperYouTube):
self.fetch_media_url(source=source, ydl_res=ydl_res)
artist_names = []
uploader = ydl_res.get("uploader", "")
if uploader.endswith(" - Topic"):
artist_names = [uploader.rstrip(" - Topic")]
artist_list = [
Artist(
name=name,
source_list=[Source(
self.SOURCE_TYPE,
f"https://music.youtube.com/channel/{ydl_res.get('channel_id', ydl_res.get('uploader_id', ''))}"
)]
) for name in artist_names]
artist_name = ydl_res.get("artist", ydl_res.get("uploader", "")).rstrip(" - Topic")
album_list = []
if "album" in ydl_res:
@ -638,57 +504,25 @@ class YoutubeMusic(SuperYouTube):
date=ID3Timestamp.strptime(ydl_res.get("upload_date"), "%Y%m%d"),
))
artist_name = artist_names[0] if len(artist_names) > 0 else None
song = Song(
return Song(
title=ydl_res.get("track", clean_song_title(ydl_res.get("title"), artist_name=artist_name)),
note=ydl_res.get("descriptions"),
album_list=album_list,
length=int(ydl_res.get("duration", 0)) * 1000,
artwork=ArtworkCollection(*ydl_res.get("thumbnails", [])),
artist_list=artist_list,
artwork=Artwork(*ydl_res.get("thumbnails", [])),
main_artist_list=[Artist(
name=artist_name,
source_list=[Source(
self.SOURCE_TYPE,
SourcePages.YOUTUBE_MUSIC,
f"https://music.youtube.com/channel/{ydl_res.get('channel_id', ydl_res.get('uploader_id', ''))}"
)]
)],
source_list=[Source(
SourcePages.YOUTUBE_MUSIC,
f"https://music.youtube.com/watch?v={ydl_res.get('id')}"
), source],
)
# other song details
parsed_url = urlparse(source.url)
browse_id = parse_qs(parsed_url.query)['v'][0]
request_data = {
"captionParams": {},
"context": {**self.credentials.context, "adSignalsInfo": {"params": []}},
"videoId": browse_id,
}
if "playlist_id" in source.additional_data:
request_data["playlistId"] = source.additional_data["playlist_id"]
initial_details = self.yt_music_connection.post(
url=get_youtube_url(path="/youtubei/v1/player", query=f"prettyPrint=false"),
json=request_data,
name=f"fetch_song_{browse_id}.json",
)
if initial_details is None:
return song
dump_to_file(f"fetch_song_{browse_id}.json", initial_details.text, is_json=True, exit_after_dump=False)
data = initial_details.json()
video_details = data.get("videoDetails", {})
browse_id = video_details.get("videoId", browse_id)
song.title = video_details.get("title", song.title)
if video_details.get("isLiveContent", False):
for album in song.album_list:
album.album_type = AlbumType.LIVE_ALBUM
for thumbnail in video_details.get("thumbnails", []):
song.artwork.add_data(**thumbnail)
song.lyrics_collection.append(self.fetch_lyrics(browse_id, playlist_id=request_data.get("playlistId")))
return song
def fetch_media_url(self, source: Source, ydl_res: dict = None) -> dict:
def _get_best_format(format_list: List[Dict]) -> dict:
@ -715,16 +549,12 @@ class YoutubeMusic(SuperYouTube):
return self.download_values_by_url[source.url]
if ydl_res is None:
try:
ydl_res = self.ydl.extract_info(url=source.url, download=False)
except DownloadError as e:
self.not_download[source.hash_url] = e
self.LOGGER.error(f"Couldn't fetch song from {source.url}. {e}")
return {"error": e}
_best_format = _get_best_format(ydl_res.get("formats", [{}]))
self.download_values_by_url[source.url] = {
"url": _best_format.get("url"),
"chunk_size": _best_format.get("downloader_options", {}).get("http_chunk_size", main_settings["chunk_size"]),
"headers": _best_format.get("http_headers", {}),
}
@ -734,7 +564,7 @@ class YoutubeMusic(SuperYouTube):
def download_song_to_target(self, source: Source, target: Target, desc: str = None) -> DownloadResult:
media = self.fetch_media_url(source)
if source.hash_url not in self.not_download and "error" not in media:
if source.hash_url not in self.not_download:
result = self.download_connection.stream_into(
media["url"],
target,
@ -743,12 +573,11 @@ class YoutubeMusic(SuperYouTube):
raw_headers=True,
disable_cache=True,
headers=media.get("headers", {}),
chunk_size=main_settings["chunk_size"],
# chunk_size=media.get("chunk_size", main_settings["chunk_size"]),
method="GET",
timeout=5,
)
else:
result = DownloadResult(error_message=str(media.get("error") or self.not_download[source.hash_url]))
result = DownloadResult(error_message=str(self.not_download[source.hash_url]))
if result.is_fatal_error:
result.merge(super().download_song_to_target(source=source, target=target, desc=desc))

View File

@ -1,40 +1,26 @@
import inspect
from datetime import datetime
from pathlib import Path
import json
import logging
from datetime import datetime
from functools import lru_cache
from pathlib import Path
from typing import Any, List, Union
import inspect
from .shared import DEBUG, DEBUG_LOGGING, DEBUG_DUMP, DEBUG_TRACE, DEBUG_OBJECT_TRACE, DEBUG_OBJECT_TRACE_CALLSTACK
from .config import config, read_config, write_config
from .enums.colors import BColors
from .hacking import merge_args
from .path_manager import LOCATIONS
from .shared import (DEBUG, DEBUG_DUMP, DEBUG_LOGGING, DEBUG_OBJECT_TRACE,
DEBUG_OBJECT_TRACE_CALLSTACK, DEBUG_TRACE, URL_PATTERN)
from .string_processing import hash_url, is_url, unify
"""
IO functions
"""
def _apply_color(msg: str, color: BColors) -> str:
if not isinstance(msg, str):
msg = str(msg)
endc = BColors.ENDC.value
if color is BColors.ENDC:
return msg
msg = msg.replace(BColors.ENDC.value, BColors.ENDC.value + color.value)
return color.value + msg + BColors.ENDC.value
@merge_args(print)
def output(*msg: List[str], color: BColors = BColors.ENDC, **kwargs):
print(*(_apply_color(s, color) for s in msg), **kwargs)
def output(msg: str, color: BColors = BColors.ENDC):
print(_apply_color(msg, color))
def user_input(msg: str, color: BColors = BColors.ENDC):
@ -65,63 +51,20 @@ def trace(msg: str):
if not DEBUG_TRACE:
return
output(BColors.OKBLUE.value + "trace: " + BColors.ENDC.value + msg)
def request_trace(msg: str):
if not DEBUG_TRACE:
return
output(BColors.OKGREEN.value + "request: " + BColors.ENDC.value + msg)
output("trace: " + msg, BColors.OKBLUE)
def object_trace(obj):
if not DEBUG_OBJECT_TRACE:
return
appendix = f" called by [{' | '.join(f'{s.function} {Path(s.filename).name}:{str(s.lineno)}' for s in inspect.stack()[1:5])}]" if DEBUG_OBJECT_TRACE_CALLSTACK else ""
output("object: " + str(obj) + appendix)
output("object: " + str(obj) + appendix, BColors.GREY)
"""
misc functions
"""
def traverse_json_path(data, path: Union[str, List[str]], default=None):
"""
Path parts are concatenated with . or wrapped with [""] for object keys and wrapped in [] for array indices.
"""
if isinstance(path, str):
path = path.replace('["', '.').replace('"]', '.').replace("[", ".").replace("]", ".")
path = [p for p in path.split(".") if len(p) > 0]
if len(path) <= 0:
return data
current = path[0]
path = path[1:]
new_data = None
if isinstance(data, dict):
new_data = data.get(current)
elif isinstance(data, list):
try:
new_data = data[int(current)]
except (IndexError, ValueError):
pass
if new_data is None:
return default
return traverse_json_path(data=new_data, path=path, default=default)
_auto_increment = 0
def generate_id() -> int:
global _auto_increment
_auto_increment += 1
return _auto_increment
def get_current_millis() -> int:
dt = datetime.now()
return int(dt.microsecond / 1_000)
@ -129,34 +72,3 @@ def get_current_millis() -> int:
def get_unix_time() -> int:
return int(datetime.now().timestamp())
@lru_cache
def custom_hash(value: Any) -> int:
if is_url(value):
value = hash_url(value)
elif isinstance(value, str):
try:
value = int(value)
except ValueError:
value = unify(value)
return hash(value)
def create_dataclass_instance(t, data: dict):
"""Creates an instance of a dataclass with the given data.
It filters out all data key, which has no attribute in the dataclass.
Args:
t (Type): The dataclass type class
data (dict): the attribute to pass into the constructor
Returns:
Tuple[Type, dict]: The created instance and a dict, containing the data, which was not used in the creation
"""
needed_data = {k: v for k, v in data.items() if k in t.__dataclass_fields__}
removed_data = {k: v for k, v in data.items() if k not in t.__dataclass_fields__}
return t(**needed_data), removed_data

View File

@ -1,8 +1,11 @@
from typing import Tuple
from .config import Config
from .config_files import main_config, logging_config, youtube_config
from .config_files import (
main_config,
logging_config,
youtube_config,
)
_sections: Tuple[Config, ...] = (
main_config.config,

View File

@ -18,9 +18,8 @@ config = Config((
AudioFormatAttribute(name="audio_format", default_value="mp3", description="""Music Kraken will stream the audio into this format.
You can use Audio formats which support ID3.2 and ID3.1,
but you will have cleaner Metadata using ID3.2."""),
Attribute(name="image_format", default_value="jpeg", description="This Changes the format in which images are getting downloaded"),
Attribute(name="result_history", default_value=True, description="""If enabled, you can go back to the previous results.
Attribute(name="result_history", default_value=False, description="""If enabled, you can go back to the previous results.
The consequence is a higher meory consumption, because every result is saved."""),
Attribute(name="history_length", default_value=8, description="""You can choose how far back you can go in the result history.
The further you choose to be able to go back, the higher the memory usage.
@ -29,7 +28,6 @@ The further you choose to be able to go back, the higher the memory usage.
EmptyLine(),
Attribute(name="preferred_artwork_resolution", default_value=1000),
Attribute(name="download_artist_artworks", default_value=True, description="Enables the fetching of artist galleries."),
EmptyLine(),
@ -46,7 +44,6 @@ This means for example, the Studio Albums and EP's are always in front of Single
- album_type
The folder music kraken should put the songs into."""),
Attribute(name="download_file", default_value="{song}.{audio_format}", description="The filename of the audio file."),
Attribute(name="artist_artwork_path", default_value="{genre}/{artist}/{artist}_{image_number}.{image_format}", description="The Path to download artist images to."),
SelectAttribute(name="album_type_blacklist", default_value=[
"Compilation Album",
"Live Album",
@ -155,13 +152,10 @@ class SettingsStructure(TypedDict):
# artwork
preferred_artwork_resolution: int
image_format: str
download_artist_artworks: bool
# paths
music_directory: Path
temp_directory: Path
artist_artwork_path: Path
log_file: Path
not_a_genre_regex: List[str]
ffmpeg_binary: Path

View File

@ -1,128 +1 @@
from __future__ import annotations
from dataclasses import dataclass
from enum import Enum
from typing import TYPE_CHECKING, Optional, Type
from mutagen.id3 import PictureType
if TYPE_CHECKING:
from ...pages.abstract import Page
@dataclass
class SourceType:
name: str
homepage: Optional[str] = None
download_priority: int = 0
page_type: Type[Page] = None
page: Page = None
def register_page(self, page: Page):
self.page = page
def __hash__(self):
return hash(self.name)
@property
def has_page(self) -> bool:
return self.page is not None
# for backwards compatibility
@property
def value(self) -> str:
return self.name
class ALL_SOURCE_TYPES:
YOUTUBE = SourceType(name="youtube", homepage="https://music.youtube.com/")
BANDCAMP = SourceType(name="bandcamp", homepage="https://bandcamp.com/", download_priority=10)
MUSIFY = SourceType(name="musify", homepage="https://musify.club/", download_priority=7)
GENIUS = SourceType(name="genius", homepage="https://genius.com/")
MUSICBRAINZ = SourceType(name="musicbrainz", homepage="https://musicbrainz.org/")
ENCYCLOPAEDIA_METALLUM = SourceType(name="encyclopaedia metallum")
DEEZER = SourceType(name="deezer", homepage="https://www.deezer.com/")
SPOTIFY = SourceType(name="spotify", homepage="https://open.spotify.com/")
# This has nothing to do with audio, but bands can be here
WIKIPEDIA = SourceType(name="wikipedia", homepage="https://en.wikipedia.org/wiki/Main_Page")
INSTAGRAM = SourceType(name="instagram", homepage="https://www.instagram.com/")
FACEBOOK = SourceType(name="facebook", homepage="https://www.facebook.com/")
TWITTER = SourceType(name="twitter", homepage="https://twitter.com/")
# Yes somehow this ancient site is linked EVERYWHERE
MYSPACE = SourceType(name="myspace", homepage="https://myspace.com/")
MANUAL = SourceType(name="manual")
PRESET = SourceType(name="preset")
class PictureType(Enum):
"""Enumeration of image types defined by the ID3 standard for the APIC
frame, but also reused in WMA/FLAC/VorbisComment.
This is copied from mutagen.id3.PictureType
"""
OTHER = 0
FILE_ICON = 1
"""32x32 pixels 'file icon' (PNG only)"""
OTHER_FILE_ICON = 2
"""Other file icon"""
COVER_FRONT = 3
"""Cover (front)"""
COVER_BACK = 4
"""Cover (back)"""
LEAFLET_PAGE = 5
"""Leaflet page"""
MEDIA = 6
"""Media (e.g. label side of CD)"""
LEAD_ARTIST = 7
"""Lead artist/lead performer/soloist"""
ARTIST = 8
"""Artist/performer"""
CONDUCTOR = 9
"""Conductor"""
BAND = 10
"""Band/Orchestra"""
COMPOSER = 11
"""Composer"""
LYRICIST = 12
"""Lyricist/text writer"""
RECORDING_LOCATION = 13
"""Recording Location"""
DURING_RECORDING = 14
"""During recording"""
DURING_PERFORMANCE = 15
"""During performance"""
SCREEN_CAPTURE = 16
"""Movie/video screen capture"""
FISH = 17
"""A bright colored fish"""
ILLUSTRATION = 18
"""Illustration"""
BAND_LOGOTYPE = 19
"""Band/artist logotype"""
PUBLISHER_LOGOTYPE = 20
"""Publisher/Studio logotype"""
from .source import SourcePages

View File

@ -0,0 +1,50 @@
from enum import Enum
class SourceTypes(Enum):
SONG = "song"
ALBUM = "album"
ARTIST = "artist"
LYRICS = "lyrics"
class SourcePages(Enum):
YOUTUBE = "youtube"
MUSIFY = "musify"
YOUTUBE_MUSIC = "youtube music"
GENIUS = "genius"
MUSICBRAINZ = "musicbrainz"
ENCYCLOPAEDIA_METALLUM = "encyclopaedia metallum"
BANDCAMP = "bandcamp"
DEEZER = "deezer"
SPOTIFY = "spotify"
# This has nothing to do with audio, but bands can be here
WIKIPEDIA = "wikipedia"
INSTAGRAM = "instagram"
FACEBOOK = "facebook"
TWITTER = "twitter" # I will use nitter though lol
MYSPACE = "myspace" # Yes somehow this ancient site is linked EVERYWHERE
MANUAL = "manual"
PRESET = "preset"
@classmethod
def get_homepage(cls, attribute) -> str:
homepage_map = {
cls.YOUTUBE: "https://www.youtube.com/",
cls.MUSIFY: "https://musify.club/",
cls.MUSICBRAINZ: "https://musicbrainz.org/",
cls.ENCYCLOPAEDIA_METALLUM: "https://www.metal-archives.com/",
cls.GENIUS: "https://genius.com/",
cls.BANDCAMP: "https://bandcamp.com/",
cls.DEEZER: "https://www.deezer.com/",
cls.INSTAGRAM: "https://www.instagram.com/",
cls.FACEBOOK: "https://www.facebook.com/",
cls.SPOTIFY: "https://open.spotify.com/",
cls.TWITTER: "https://twitter.com/",
cls.MYSPACE: "https://myspace.com/",
cls.WIKIPEDIA: "https://en.wikipedia.org/wiki/Main_Page"
}
return homepage_map[attribute]

View File

@ -1,23 +1 @@
class MKBaseException(Exception):
def __init__(self, message: str = None, **kwargs) -> None:
self.message = message
super().__init__(message, **kwargs)
# Downloading
class MKDownloadException(MKBaseException):
pass
class MKMissingNameException(MKDownloadException):
pass
# Frontend
class MKFrontendException(MKBaseException):
pass
class MKInvalidInputException(MKFrontendException):
pass
__all__ = ["config"]

View File

@ -78,14 +78,7 @@ def _merge(
drop_args = []
if drop_kwonlyargs is None:
drop_kwonlyargs = []
is_builtin = False
try:
source_spec = inspect.getfullargspec(source)
except TypeError:
is_builtin = True
source_spec = inspect.FullArgSpec(type(source).__name__, [], [], [], [], [], [])
dest_spec = inspect.getfullargspec(dest)
if source_spec.varargs or source_spec.varkw:
@ -135,15 +128,13 @@ def _merge(
'co_kwonlyargcount': len(kwonlyargs_merged),
'co_posonlyargcount': dest.__code__.co_posonlyargcount,
'co_nlocals': len(args_all),
'co_flags': source.__code__.co_flags,
'co_varnames': args_all,
'co_filename': dest.__code__.co_filename,
'co_name': dest.__code__.co_name,
'co_firstlineno': dest.__code__.co_firstlineno,
}
if hasattr(source, "__code__"):
replace_kwargs['co_flags'] = source.__code__.co_flags
if PY310:
replace_kwargs['co_linetable'] = dest.__code__.co_linetable
else:
@ -160,7 +151,7 @@ def _merge(
len(kwonlyargs_merged),
_blank.__code__.co_nlocals,
_blank.__code__.co_stacksize,
source.__code__.co_flags if hasattr(source, "__code__") else dest.__code__.co_flags,
source.__code__.co_flags,
_blank.__code__.co_code, (), (),
args_all, dest.__code__.co_filename,
dest.__code__.co_name,
@ -180,9 +171,6 @@ def _merge(
dest_ret = dest.__annotations__['return']
for v in ('__kwdefaults__', '__annotations__'):
if not hasattr(source, v):
continue
out = getattr(source, v)
if out is None:
out = {}

View File

@ -19,8 +19,7 @@ DEBUG_OBJECT_TRACE = DEBUG and False
DEBUG_OBJECT_TRACE_CALLSTACK = DEBUG_OBJECT_TRACE and False
DEBUG_YOUTUBE_INITIALIZING = DEBUG and False
DEBUG_PAGES = DEBUG and False
DEBUG_DUMP = DEBUG and True
DEBUG_PRINT_ID = DEBUG and True
DEBUG_DUMP = DEBUG and False
if DEBUG:
print("DEBUG ACTIVE")

View File

@ -1,15 +1,12 @@
import re
from typing import Tuple, Union, Optional
from pathlib import Path
import string
from functools import lru_cache
from pathlib import Path
from typing import Any, Optional, Tuple, Union
from urllib.parse import ParseResult, parse_qs, urlparse
from pathvalidate import sanitize_filename
from transliterate import translit
from transliterate.exceptions import LanguageDetectionError
from transliterate import translit
from pathvalidate import sanitize_filename
from .shared import URL_PATTERN
COMMON_TITLE_APPENDIX_LIST: Tuple[str, ...] = (
"(official video)",
@ -24,7 +21,6 @@ def unify(string: str) -> str:
returns a unified str, to make comparisons easy.
a unified string has the following attributes:
- is lowercase
- is transliterated to Latin characters from e.g. Cyrillic
"""
if string is None:
@ -35,33 +31,23 @@ def unify(string: str) -> str:
except LanguageDetectionError:
pass
string = unify_punctuation(string)
return string.lower().strip()
return string.lower()
def fit_to_file_system(string: Union[str, Path], hidden_ok: bool = False) -> Union[str, Path]:
def fit_to_file_system(string: Union[str, Path]) -> Union[str, Path]:
def fit_string(string: str) -> str:
nonlocal hidden_ok
if string == "/":
return "/"
string = string.strip()
while string[0] == "." and not hidden_ok:
while string[0] == ".":
if len(string) == 0:
return string
string = string[1:]
string = string.replace("/", "_").replace("\\", "_")
try:
string = translit(string, reversed=True)
except LanguageDetectionError:
pass
string = sanitize_filename(string)
return string
if isinstance(string, Path):
@ -107,7 +93,7 @@ def clean_song_title(raw_song_title: str, artist_name: Optional[str] = None) ->
break
substring = raw_song_title[open_bracket_index + 1:close_bracket_index]
if any(disallowed_substring in substring.lower() for disallowed_substring in DISALLOWED_SUBSTRING_IN_BRACKETS):
if any(disallowed_substring in substring for disallowed_substring in DISALLOWED_SUBSTRING_IN_BRACKETS):
raw_song_title = raw_song_title[:open_bracket_index] + raw_song_title[close_bracket_index + 1:]
else:
start = close_bracket_index + 1
@ -118,13 +104,10 @@ def clean_song_title(raw_song_title: str, artist_name: Optional[str] = None) ->
# Remove artist from the start of the title
if raw_song_title.lower().startswith(artist_name.lower()):
raw_song_title = raw_song_title[len(artist_name):].strip()
possible_new_name = raw_song_title[len(artist_name):].strip()
for char in ("-", "", ":", "|"):
if possible_new_name.startswith(char):
raw_song_title = possible_new_name[1:].strip()
break
if raw_song_title.startswith("-"):
raw_song_title = raw_song_title[1:].strip()
return raw_song_title.strip()
@ -142,45 +125,13 @@ UNIFY_TO = " "
ALLOWED_LENGTH_DISTANCE = 20
def unify_punctuation(to_unify: str, unify_to: str = UNIFY_TO) -> str:
def unify_punctuation(to_unify: str) -> str:
for char in string.punctuation:
to_unify = to_unify.replace(char, unify_to)
to_unify = to_unify.replace(char, UNIFY_TO)
return to_unify
@lru_cache(maxsize=128)
def hash_url(url: Union[str, ParseResult]) -> str:
if isinstance(url, str):
url = urlparse(url)
unify_to = "-"
def unify_part(part: str) -> str:
nonlocal unify_to
return unify_punctuation(part.lower(), unify_to=unify_to).strip(unify_to)
# netloc
netloc = unify_part(url.netloc)
if netloc.startswith("www" + unify_to):
netloc = netloc[3 + len(unify_to):]
# query
query = url.query
query_dict: Optional[dict] = None
try:
query_dict: dict = parse_qs(url.query, strict_parsing=True)
except ValueError:
# the query couldn't be parsed
pass
if isinstance(query_dict, dict):
# sort keys alphabetically
query = ""
for key, value in sorted(query_dict.items(), key=lambda i: i[0]):
query += f"{key.strip()}-{''.join(i.strip() for i in value)}"
r = f"{netloc}_{unify_part(url.path)}_{unify_part(query)}"
r = r.lower().strip()
return r
def hash_url(url: str) -> int:
return url.strip().lower().lstrip("https://").lstrip("http://")
def remove_feature_part_from_track(title: str) -> str:
@ -226,18 +177,3 @@ def match_length(length_1: int | None, length_2: int | None) -> bool:
return True
return abs(length_1 - length_2) <= ALLOWED_LENGTH_DISTANCE
def shorten_display_url(url: str, max_length: int = 150, chars_at_end: int = 4, shorten_string: str = "[...]") -> str:
if len(url) <= max_length + chars_at_end + len(shorten_string):
return url
return url[:max_length] + shorten_string + url[-chars_at_end:]
def is_url(value: Any) -> bool:
if isinstance(value, ParseResult):
return True
if not isinstance(value, str):
return True
# value has to be a string
return re.match(URL_PATTERN, value) is not None

View File

@ -1,13 +1,9 @@
from __future__ import annotations
from dataclasses import dataclass, field
from typing import TYPE_CHECKING, List, Tuple
from typing import List, Tuple
if TYPE_CHECKING:
from ...objects import Target
from ...utils.config import logging_settings, main_settings
from ...utils.config import main_settings, logging_settings
from ...utils.enums.colors import BColors
from ...objects import Target
UNIT_PREFIXES: List[str] = ["", "k", "m", "g", "t"]
UNIT_DIVISOR = 1024

View File

@ -24,7 +24,7 @@ class Query:
return [self.music_object.name]
if isinstance(self.music_object, Song):
return [f"{artist.name} - {self.music_object}" for artist in self.music_object.artist_collection]
return [f"{artist.name} - {self.music_object}" for artist in self.music_object.main_artist_collection]
if isinstance(self.music_object, Album):
return [f"{artist.name} - {self.music_object}" for artist in self.music_object.artist_collection]

View File

@ -69,7 +69,7 @@ dependencies = [
"toml~=0.10.2",
"typing_extensions~=4.7.1",
"python-sponsorblock~=0.1",
"sponsorblock~=0.1.3",
"youtube_dl",
]
dynamic = [

25
requirements.txt Normal file
View File

@ -0,0 +1,25 @@
requests~=2.31.0
mutagen~=1.46.0
musicbrainzngs~=0.7.1
jellyfish~=0.9.0
beautifulsoup4~=4.11.1
pycountry~=24.0.1
python-dateutil~=2.8.2
pandoc~=2.3
SQLAlchemy~=2.0.7
setuptools~=68.2.0
tqdm~=4.65.0
ffmpeg-python~=0.2.0
platformdirs~=4.2.0
transliterate~=1.10.2
sponsorblock~=0.1.3
regex~=2022.9.13
pyffmpeg~=2.4.2.18
ffmpeg-progress-yield~=0.7.8
pathvalidate~=2.5.2
guppy3~=3.1.3
toml~=0.10.2
typing_extensions~=4.7.1
responses~=0.24.1
youtube_dl
merge_args~=0.1.5

View File

View File

@ -3,76 +3,78 @@ import unittest
from music_kraken.objects import Song, Album, Artist, Collection, Country
class TestCollection(unittest.TestCase):
def test_song_contains_album(self):
"""
Tests that every song contains the album it is added to in its album_collection
"""
a_1 = Album(
title="album",
song_list= [
Song(title="song"),
]
)
a_2 = a_1.song_collection[0].album_collection[0]
self.assertTrue(a_1.id == a_2.id)
def test_album_contains_song(self):
"""
Tests that every album contains the song it is added to in its song_collection
"""
s_1 = Song(
title="song",
album_list=[
Album(title="album"),
]
)
s_2 = s_1.album_collection[0].song_collection[0]
self.assertTrue(s_1.id == s_2.id)
def test_auto_add_artist_to_album_feature_artist(self):
"""
Tests that every artist is added to the album's feature_artist_collection per default
"""
a_1 = Artist(
@staticmethod
def complicated_object() -> Artist:
return Artist(
name="artist",
album_list=[
Album(title="album")
]
)
a_2 = a_1.album_collection[0].feature_artist_collection[0]
self.assertTrue(a_1.id == a_2.id)
def test_auto_add_artist_to_album_feature_artist_push(self):
"""
Tests that every artist is added to the album's feature_artist_collection per default but pulled into the album's artist_collection if a merge exitst
"""
a_1 = Artist(
name="artist",
album_list=[
country=Country.by_alpha_2("DE"),
main_album_list=[
Album(
title="album",
artist_list=[
Artist(name="artist"),
song_list=[
Song(
title="song",
album_list=[
Album(title="album", albumsort=123),
],
),
Song(
title="other_song",
album_list=[
Album(title="album", albumsort=423),
],
),
]
),
Album(title="album", barcode="1234567890123"),
]
)
]
)
a_2 = a_1.album_collection[0].artist_collection[0]
self.assertTrue(a_1.id == a_2.id)
def test_song_album_relation(self):
"""
Tests that
album = album.any_song.one_album
is the same object
"""
a = self.complicated_object().main_album_collection[0]
b = a.song_collection[0].album_collection[0]
c = a.song_collection[1].album_collection[0]
d = b.song_collection[0].album_collection[0]
e = d.song_collection[0].album_collection[0]
f = e.song_collection[0].album_collection[0]
g = f.song_collection[0].album_collection[0]
self.assertTrue(a.id == b.id == c.id == d.id == e.id == f.id == g.id)
self.assertTrue(a.title == b.title == c.title == d.title == e.title == f.title == g.title == "album")
self.assertTrue(a.barcode == b.barcode == c.barcode == d.barcode == e.barcode == f.barcode == g.barcode == "1234567890123")
self.assertTrue(a.albumsort == b.albumsort == c.albumsort == d.albumsort == e.albumsort == f.albumsort == g.albumsort == 123)
d.title = "new_title"
self.assertTrue(a.title == b.title == c.title == d.title == e.title == f.title == g.title == "new_title")
def test_album_artist_relation(self):
"""
Tests that
artist = artist.any_album.any_song.one_artist
is the same object
"""
a = self.complicated_object()
b = a.main_album_collection[0].artist_collection[0]
c = b.main_album_collection[0].artist_collection[0]
d = c.main_album_collection[0].artist_collection[0]
self.assertTrue(a.id == b.id == c.id == d.id)
self.assertTrue(a.name == b.name == c.name == d.name == "artist")
self.assertTrue(a.country == b.country == c.country == d.country)
def test_artist_artist_relation(self):
"""
Tests the proper syncing between album.artist_collection and song.artist_collection
"""
album = Album(
artist = Artist(
name="artist",
main_album_list=[
Album(
title="album",
song_list=[
Song(title="song"),
@ -81,20 +83,16 @@ class TestCollection(unittest.TestCase):
Artist(name="artist"),
]
)
a_1 = album.artist_collection[0]
a_2 = album.song_collection[0].artist_collection[0]
]
)
self.assertTrue(a_1.id == a_2.id)
self.assertTrue(artist.id == artist.main_album_collection[0].song_collection[0].main_artist_collection[0].id)
def test_artist_collection_sync(self):
"""
tests the actual implementation of the test above
"""
album_1 = Album(
title="album",
song_list=[
Song(title="song", artist_list=[Artist(name="artist")]),
Song(title="song", main_artist_list=[Artist(name="artist")]),
],
artist_list=[
Artist(name="artist"),
@ -104,7 +102,7 @@ class TestCollection(unittest.TestCase):
album_2 = Album(
title="album",
song_list=[
Song(title="song", artist_list=[Artist(name="artist")]),
Song(title="song", main_artist_list=[Artist(name="artist")]),
],
artist_list=[
Artist(name="artist"),
@ -113,7 +111,17 @@ class TestCollection(unittest.TestCase):
album_1.merge(album_2)
self.assertTrue(id(album_1.artist_collection) == id(album_1.artist_collection) == id(album_1.song_collection[0].artist_collection) == id(album_1.song_collection[0].artist_collection))
self.assertTrue(id(album_1.artist_collection) == id(album_1.artist_collection) == id(album_1.song_collection[0].main_artist_collection) == id(album_1.song_collection[0].main_artist_collection))
def test_song_artist_relations(self):
a = self.complicated_object()
b = a.main_album_collection[0].song_collection[0].main_artist_collection[0]
c = b.main_album_collection[0].song_collection[0].main_artist_collection[0]
d = c.main_album_collection[0].song_collection[0].main_artist_collection[0]
self.assertTrue(a.id == b.id == c.id == d.id)
self.assertTrue(a.name == b.name == c.name == d.name == "artist")
self.assertTrue(a.country == b.country == c.country == d.country)
if __name__ == "__main__":
unittest.main()

View File

@ -1,35 +0,0 @@
import unittest
from music_kraken.utils.string_processing import hash_url
class TestCollection(unittest.TestCase):
def test_remove_schema(self):
self.assertFalse(hash_url("https://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
self.assertFalse(hash_url("ftp://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
self.assertFalse(hash_url("sftp://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
self.assertFalse(hash_url("http://www.youtube.com/watch?v=3jZ_D3ELwOQ").startswith("https"))
def test_no_punctuation(self):
self.assertNotIn(hash_url("https://www.you_tube.com/watch?v=3jZ_D3ELwOQ"), "you_tube")
self.assertNotIn(hash_url("https://docs.gitea.com/next/install.ation/comparison"), ".")
def test_three_parts(self):
"""
The url is parsed into three parts [netloc; path; query]
Which are then appended to each other with an underscore between.
"""
self.assertTrue(hash_url("https://duckduckgo.com/?t=h_&q=dfasf&ia=web").count("_") == 2)
def test_sort_query(self):
"""
The query is sorted alphabetically
"""
hashed = hash_url("https://duckduckgo.com/?t=h_&q=dfasf&ia=web")
sorted_keys = ["ia-", "q-", "t-"]
self.assertTrue(hashed.index(sorted_keys[0]) < hashed.index(sorted_keys[1]) < hashed.index(sorted_keys[2]))
if __name__ == "__main__":
unittest.main()