feature/artwork_gallery #41

Merged
Hazel merged 25 commits from feature/artwork_gallery into experimental 2024-07-15 09:36:22 +00:00
Owner
No description provided.
Luna added 7 commits 2024-06-04 09:13:13 +00:00
Artwork gallery Musify
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
5d26fdbf94
feat: added hooks for collection on append
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
49c3734526
feat: added album.artwork to datastructure
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
1ef4b27f28
feat: musify completed
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
05ee09e25f
feat: structure changes to artwork and collection objects
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
d51e3a56fb
Owner

Please add the following settings:

```
image_format = "jpg"
download_artist_artworks = true
artist_artwork_path = "{genre}/{artist}/{artist}.{image_format}"
```
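For illustration only, here is a minimal sketch of how a path template like `artist_artwork_path` might be expanded. The setting names come from the comment above; the use of `str.format`, the helper name `artist_artwork_target`, and the sample values are assumptions, not the project's actual implementation.

```python
# Setting values copied from the comment above.
ARTIST_ARTWORK_PATH = "{genre}/{artist}/{artist}.{image_format}"
IMAGE_FORMAT = "jpg"


def artist_artwork_target(genre: str, artist: str) -> str:
    """Fill the path template for one artist (hypothetical helper)."""
    return ARTIST_ARTWORK_PATH.format(
        genre=genre,
        artist=artist,
        image_format=IMAGE_FORMAT,
    )


print(artist_artwork_target("Rock", "Queen"))  # Rock/Queen/Queen.jpg
```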
Luna added 1 commit 2024-06-04 09:44:56 +00:00
feat: config changes
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
d83e40ed83
Luna added 1 commit 2024-06-05 06:34:46 +00:00
feat: bandcamp artist artwork
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
7d23ecac06
Luna added 1 commit 2024-06-05 07:47:10 +00:00
feat: fix saving img in tmp
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
3118140f0f
Hazel added 1 commit 2024-06-05 10:05:46 +00:00
feat: renamed artwork
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
df98a70717
Luna added 2 commits 2024-06-05 11:49:57 +00:00
Hazel added 2 commits 2024-06-06 15:53:52 +00:00
Merge branch 'feature/artwork_gallery' of ssh://gitea.elara.ws:2222/music-kraken/music-kraken-core into feature/artwork_gallery
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
01dffc2443
Author
Owner

Is the Interface ready?

Hazel added 1 commit 2024-06-07 09:15:30 +00:00
feat: removed distracting code
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
eef3ea7f07
Hazel added 1 commit 2024-06-07 09:17:53 +00:00
feat: added extend
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
346d273201
Owner

> Is the Interface ready?

@Luna not yet, but the code is pushed. I suggest you look at it and try implementing it yourself. Here is how the interface should work:

Classes

  • `ArtworkCollection`: All the different artworks for one data object
  • `Artwork`: One artwork with multiple `ArtworkVariant`s. This is necessary because the same picture may be found under multiple different URLs
  • `ArtworkVariant`: Contains the actual URL of the artwork

Functions

The most important function for scraping is `ArtworkCollection.add_data`, which adds the data passed into the function to existing entities or to a new one. It matches new data against existing entities by URL. `Artwork.add_data` does the same. This should be implemented.

For consistency, `ArtworkCollection.append` has to exist even though it probably won't be used much. There are two options for how to go about it:

  • You can pass in a dict of data, an `ArtworkVariant`, or an `Artwork`
  • You can only pass in an `Artwork`

The second option would be cleaner imo, but I want your opinion.

`ArtworkCollection.compile` fetches every variant of the artworks and is called once the data object is compiled (aka on download). The duplicate recognition should be implemented there.


Unfortunately I couldn't run the changes because of time, so you will probably have to fix many small issues because the changes are so big.
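The interface described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code that was pushed: the class and method names come from the comment, while the `width`/`height` fields, the `find` helper, and the keyword-argument signature of `add_data` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArtworkVariant:
    """Holds the actual URL of one rendition of an artwork."""
    url: str
    width: Optional[int] = None   # hypothetical extra field
    height: Optional[int] = None  # hypothetical extra field


@dataclass
class Artwork:
    """One picture that may be reachable under several URLs."""
    variants: List[ArtworkVariant] = field(default_factory=list)

    def find(self, url: str) -> Optional[ArtworkVariant]:
        """Return the variant with this URL, if any (hypothetical helper)."""
        return next((v for v in self.variants if v.url == url), None)

    def add_data(self, url: str, **data) -> ArtworkVariant:
        """Update the variant with this URL, or create it if the URL is new."""
        variant = self.find(url)
        if variant is None:
            variant = ArtworkVariant(url=url)
            self.variants.append(variant)
        for key, value in data.items():
            setattr(variant, key, value)
        return variant


@dataclass
class ArtworkCollection:
    """All artworks that belong to one data object."""
    artworks: List[Artwork] = field(default_factory=list)

    def add_data(self, url: str, **data) -> Artwork:
        """Route data to the artwork that already knows this URL, else start a new one."""
        for artwork in self.artworks:
            if artwork.find(url) is not None:
                artwork.add_data(url, **data)
                return artwork
        artwork = Artwork()
        artwork.add_data(url, **data)
        self.artworks.append(artwork)
        return artwork

    def append(self, artwork: Artwork) -> None:
        """Second option from the comment: only whole Artwork objects are accepted."""
        self.artworks.append(artwork)


collection = ArtworkCollection()
collection.add_data("https://example.com/cover.jpg", width=500)
collection.add_data("https://example.com/cover.jpg", height=500)  # same URL: merged
collection.add_data("https://example.com/other.jpg")              # new URL: new artwork
print(len(collection.artworks))  # 2
```

Matching by URL keeps scraping code simple: a page can call `add_data` repeatedly without checking first whether it has seen the image before.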
Hazel added 1 commit 2024-06-07 09:28:01 +00:00
feat: added compile
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
2da7a48b72
Author
Owner

Perfect thanks I'll probably do that tonight

Owner

Duplicate Recognition of the images

> This comment is only about removing duplicates

@Elara6331 said Google uses the SimHash algorithm for their web crawler, and it doesn't seem too bad to implement.

@Luna found this code in a CLI tool, which looks really nice:

```python
from PIL import Image
import imagehash

# image_filenames is assumed to be an iterable of image paths defined elsewhere.
images = {}  # maps each hash to the list of filenames that produced it

for img in sorted(image_filenames):
    try:
        hash = imagehash.average_hash(Image.open(img))
    except Exception as e:
        print('Problem:', e, 'with', img)
        continue

    if hash in images:
        print(img, ' already exists as', ' '.join(images[hash]))
        if 'dupPictures' in img:
            print('rm -v', img)
    # Record every image under its hash, whether or not it is a duplicate.
    images[hash] = images.get(hash, []) + [img]
```

It uses [ImageHash](https://pypi.org/project/ImageHash/), which looks REALLY good and has 3k stars, so this seems to be the best option.

One thing to note is that this can calculate the difference between images, so I would just add a setting for the threshold at which two images count as duplicates.

```python
>>> from PIL import Image
>>> import imagehash
>>> hash = imagehash.average_hash(Image.open('tests/data/imagehash.png'))
>>> print(hash)
ffd7918181c9ffff
>>> otherhash = imagehash.average_hash(Image.open('tests/data/peppers.png'))
>>> print(otherhash)
9f172786e71f1e00
>>> print(hash == otherhash)
False
>>> print(hash - otherhash)  # hamming distance
33
```
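The threshold idea can be sketched without any image library by working directly on the hex strings that ImageHash prints. This is only an illustration: `DUPLICATE_THRESHOLD` is a hypothetical setting name, its value of 5 is arbitrary, and the second hash in the list is made up to act as a near duplicate.

```python
from typing import List

# Hypothetical setting: maximum number of differing bits for two images
# to still count as duplicates.
DUPLICATE_THRESHOLD = 5


def hamming_distance(hex_a: str, hex_b: str) -> int:
    """Number of differing bits between two equally long hex-encoded hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def deduplicate(hashes: List[str], threshold: int = DUPLICATE_THRESHOLD) -> List[str]:
    """Keep a hash only if it differs by more than `threshold` bits from every kept hash."""
    kept: List[str] = []
    for candidate in hashes:
        if all(hamming_distance(candidate, k) > threshold for k in kept):
            kept.append(candidate)
    return kept


hashes = [
    "ffd7918181c9ffff",  # imagehash.png from the session above
    "ffd7918181c9fffe",  # made-up near duplicate: differs in one bit
    "9f172786e71f1e00",  # peppers.png from the session above
]
print(hamming_distance(hashes[0], hashes[2]))  # 33, as in the session above
print(deduplicate(hashes))  # ['ffd7918181c9ffff', '9f172786e71f1e00']
```

A threshold of 0 would only drop exact hash matches, while larger values also catch resized or recompressed copies at the cost of more false positives.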
Author
Owner

👍 So no change: image hashing should use the same library that you found.
Owner

> 👍 So no change: image hashing should use the same library that you found.

If neither @Elara6331 nor you have good arguments that outweigh the pros of this library, then yes 👍

But I highly doubt this is controversial at all.
Owner

> If neither @Elara6331 nor you have good arguments that outweigh the pros of this library, then yes 👍

ImageHash looks good to me :3
Author
Owner

> `ArtworkCollection.compile` fetches every variant of the artworks and is called once the data object is compiled (aka on download). The duplicate recognition should be implemented there.

There is a slight problem with that: since artwork is an object, I can't import download, because download itself imports artwork. So the question is what I should do about that problem.
Author
Owner

> There is a slight problem with that: since artwork is an object, I can't import download, because download itself imports artwork. So the question is what I should do about that problem.

It wouldn't even be that bad; it just means that I can't check for duplicates in artwork.
Owner

Yes I am aware of this issue. Fuck circular imports, so easy to avoid when designing a language. Because of technicalities it works if you import the connection class in the Artwork constructor. Then just create an instance of the class in the constructor right after. Just give it a constant module name. Cuz then I can optimize the collection class to only create one global instance of each module later <333
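
The workaround described above (importing inside the constructor instead of at module level) might look roughly like this. It is a sketch under assumptions: `http.client.HTTPConnection` is only a stand-in for the project's connection class, which is not shown in this thread, and the attribute name `connection` is hypothetical.

```python
class Artwork:
    def __init__(self) -> None:
        # Importing here, not at the top of the module, delays the import
        # until the first instance is created. By then both modules of an
        # import cycle have finished loading, so the cycle does not bite.
        # http.client is a stand-in for the real connection module.
        from http.client import HTTPConnection

        # Creating the object does not open a socket; nothing is sent yet.
        self.connection = HTTPConnection("example.com")


artwork = Artwork()
print(artwork.connection.host)  # example.com
```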

Author
Owner

Okay thanks ugly way to do it (not your fault) but it's a way xD thanks for the help ❤️🫂

Owner

@Elara6331 you gonna love this fix and limitation xD

Author
Owner

With that issue out of the way, I think I'll have it done tonight, except the metadata stuff; not really sure where to put what 🙃
Owner

> @Elara6331 you gonna love this fix and limitation xD

Python allows circular imports?!
Also, how does that workaround work?? This makes no sense. Circular imports should just be disallowed completely.

The more I learn about Python, the more convinced I am that it was designed by a drunk 5 year old with no programming experience lol.
Author
Owner

Still better than Java, C#, and many other languages
Author
Owner

I hope you meant circular imports shouldn't exist

Author
Owner

Because it's really really pointless and annoying

Owner

> I hope you meant circular imports shouldn't exist

Yes, that's what I meant. Circular imports should not be a thing.
Author
Owner

I think not even Java has that "feature", and that says a lot 😂
Author
Owner

How can it be so hard to import with that annoying `__import__` function?
Author
Owner

I kinda don't know how to fix this. I can't really import the connection module because it imports some other modules and it just doesn't work, and I am out of ideas.
Author
Owner

Since I am kinda too dumb to handle this, I am gonna put the compile function into page_attributes. If any of you have a better idea, just tell me.
Owner

@Luna I can't look at it, if you don't push your changes.

Author
Owner

Sorry

Luna added 1 commit 2024-06-10 13:03:13 +00:00
feat:a lot of nonsences
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
4ee6fd2137
Hazel added 2 commits 2024-06-11 12:54:47 +00:00
feat: implemented fetching of artworks on compile
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
274f1bce90
Hazel added 1 commit 2024-06-11 12:58:12 +00:00
fix: circular input
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
dd99e60afd
Owner

@Luna I fixed the add_data function, and I fixed the circular import in a very clean way. Now you should be able to continue <333

Author
Owner

You're the bestttt 🥹❤️❤️❤️❤️ thank you so so much
Author
Owner

Wait wait wait, you fixed the circular import issue with just one line of code ?!!!???
Also, does this mean I can do the download in the artwork class?
Owner

> Wait wait wait, you fixed the circular import issue with just one line of code ?!!!???
> Also, does this mean I can do the download in the artwork class?

Yes. All I did was import the objects module only for the type checker. Thus you still have IntelliSense, but no circular import :)
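A type-checker-only import of this kind usually relies on `typing.TYPE_CHECKING`. The sketch below shows the general pattern, not the actual commit: the module path `music_kraken.objects`, the `Song` class, and the `describe` function are hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers and IDEs, never at runtime,
    # so this import cannot take part in an import cycle.
    # The module path and class name are hypothetical.
    from music_kraken.objects import Song


def describe(song: "Song") -> str:
    """The quoted annotation keeps the name from being looked up at runtime."""
    return f"song: {song.title}"


class FakeSong:
    """Stand-in object so the sketch runs without the real module."""
    title = "Example"


print(describe(FakeSong()))  # song: Example
```

Because `TYPE_CHECKING` is `False` at runtime, the guarded import is skipped entirely, while editors and type checkers still resolve `Song` for autocompletion.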
Author
Owner

That's so crazy, you're the best ❤️
Luna added 1 commit 2024-06-17 12:50:47 +00:00
feat: musify ArtworkCollection simple function
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
17c28722fb
Luna added 1 commit 2024-07-01 13:00:01 +00:00
feat: image hash implemented
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
93c9a367a2
Author
Owner

Musify is done. Genius and Bandcamp should also work, but something is majorly wrong with YouTube Music: Invidious isn't even responding. I am gonna investigate further and write up everything that I find, but I don't think I am qualified to fix this issue.
Luna added 1 commit 2024-07-02 15:20:36 +00:00
feat: genius fixes and duplicate detection
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
5ce76c758e
Author
Owner

Okay, I am done. It would be great if @Hazel could look over it and remove the WIP tag if you're happy with the result.
Luna changed title from WIP: feature/artwork_gallery to feature/artwork_gallery 2024-07-03 11:55:06 +00:00
Hazel merged commit 810aff4163 into experimental 2024-07-15 09:36:22 +00:00
Reference: music-kraken/music-kraken-core#41