(https://github.com/spookyuser/hacker-reads)
I'm very curious how you managed to find book titles. I ran into a lot of issues trying to figure out, for example, whether to search for "Clean Code" or "Clean Code: A Handbook of Agile Software Craftsmanship", since people mentioning the book used both forms. And of course someone mentioning just "Clean Code" might be referring to the concept, not the book. I ended up settling on `${titleMinusColon} - ${author}`, but I'd love to know what your approach was, given that you used deep learning to search.
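For anyone wondering what that query format looks like in practice, here is a rough sketch. The function name `buildBookQuery` is made up for illustration, and it assumes the subtitle is simply everything after the first colon, which may not match what hacker-reads actually does.

```typescript
// Hypothetical helper (not taken from the hacker-reads repo): builds a search
// query of the form `${titleMinusColon} - ${author}`.
// Assumption: the subtitle is everything after the first colon in the title.
function buildBookQuery(title: string, author: string): string {
  const colonIndex = title.indexOf(":");
  const titleMinusColon = (
    colonIndex === -1 ? title : title.slice(0, colonIndex)
  ).trim();
  return `${titleMinusColon} - ${author}`;
}

// Both spellings of the title collapse to the same query string.
console.log(
  buildBookQuery(
    "Clean Code: A Handbook of Agile Software Craftsmanship",
    "Robert C. Martin"
  )
);
console.log(buildBookQuery("Clean Code", "Robert C. Martin"));
```

Adding the author is what separates the book from the concept: a comment that mentions only "Clean Code" with no author nearby will not match the combined query.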
I'm very curious how you managed to search for book titles? I tried building something adjacent to this but book titles were already provided since
Thank you!
The Medium post is amazingly written! I basically did the same thing, and you beat me with the data augmentation piece. I tried using nlpaug [0], but it didn't improve the model's performance. I'll definitely try swapping book titles around.
[0] https://github.com/makcedward/nlpaug
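A minimal sketch of what "swapping book titles around" could mean for labeled training data, assuming each example stores the character span of the title. The `LabeledExample` shape and the `swapTitles` name are hypothetical, and this is only a guess at the idea, not the method from the Medium post.

```typescript
// Hypothetical shape for one labeled example: the text plus the character
// span [start, end) that covers the book title.
interface LabeledExample {
  text: string;
  start: number;
  end: number;
}

// Produce one new example per replacement title by substituting it for the
// labeled title and recomputing the end of the span.
function swapTitles(
  example: LabeledExample,
  titles: string[]
): LabeledExample[] {
  const before = example.text.slice(0, example.start);
  const after = example.text.slice(example.end);
  return titles.map((title) => ({
    text: before + title + after,
    start: example.start,
    end: example.start + title.length,
  }));
}

// Example usage with a made-up sentence.
const sentence = "I really enjoyed Clean Code last year.";
const titleStart = sentence.indexOf("Clean Code");
const original: LabeledExample = {
  text: sentence,
  start: titleStart,
  end: titleStart + "Clean Code".length,
};

const augmented = swapTitles(original, ["1984", "The Pragmatic Programmer"]);
for (const ex of augmented) {
  console.log(ex.text, "->", ex.text.slice(ex.start, ex.end));
}
```

The surrounding sentence stays fixed while the title varies, so each new example keeps a correct label for free.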
Another book that occasionally gets mentions (51 results for https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...) but appears to be missing on the site AFAICT (not related to 1984, but just came to mind):
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
Unrelated to the above, but it would also be nice if the site could search by author (I don't seem to get any hits when putting in author names) or even year of publication.
The whole project is open-source and I already added a few podcast shows with all their book recommendations (I have to add a lot more though): https://github.com/JohannesHa/PodcastBookLibraryMonoRepo