NLP-CNN-Subreddit-Sorter-Heroku-App
End-to-end development of an application that uses a convolutional neural network to suggest to users and moderators which technical subreddit a post actually belongs in. It features a novel method for determining the number of CNN filters, along with custom Word2vec embeddings. The subreddits chosen are all technical and closely related, so the app benefits users and moderators interested in data science and related fields. (Exploratory data analysis, feature engineering, custom word2vec embeddings, convolutional neural network, deployment via flask to
This app could be expanded to include other similar technical subreddits, and it could serve as a way to decide where to crosspost or let moderators automatically flag off-topic posts. Here is the repo: https://github.com/djthorne333/NLP-CNN-Subreddit-Sorter-Application, and here is a link to the app: https://datascience-reddit-post-sorter.herokuapp.com/. I think I found a way to derive, from the dataset itself, the optimal number of filters to use for each filter size in the CNN. I still have some typos to fix, but it's mostly done. Please let me know what you think and share any advice, as I am trying to break into data science.
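To make the crossposting and auto-flagging idea a bit more concrete, here is a rough sketch of how a classifier's per-subreddit probabilities might be turned into suggestions. This is not code from the repo: the function names, the thresholds (`min_prob`, `max_prob`), and the subreddit names in the example are all placeholders assumed for illustration, and the probabilities are made up rather than produced by the actual CNN.

```python
from typing import Dict, List


def suggest_crossposts(probs: Dict[str, float], posted_in: str,
                       min_prob: float = 0.25) -> List[str]:
    """Return other subreddits whose predicted probability is at least
    min_prob, ordered from most to least likely.

    `probs` is assumed to map subreddit name -> model probability.
    """
    candidates = [
        (p, name) for name, p in probs.items()
        if name != posted_in and p >= min_prob
    ]
    # Sort by probability descending, then by name for a stable order.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [name for _, name in candidates]


def is_off_topic(probs: Dict[str, float], posted_in: str,
                 max_prob: float = 0.10) -> bool:
    """Flag a post when the model gives the subreddit it was posted in
    a probability below max_prob (a hypothetical moderation threshold)."""
    return probs.get(posted_in, 0.0) < max_prob


# Hypothetical model output for one post (placeholder names and numbers).
example_probs = {"datascience": 0.05, "MachineLearning": 0.60, "statistics": 0.35}
crossposts = suggest_crossposts(example_probs, posted_in="datascience")
flagged = is_off_topic(example_probs, posted_in="datascience")
```

In practice the thresholds would need tuning on held-out posts, since the subreddits are similar enough that the model's probabilities may often be spread across several of them.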