pypandoc
nbdev
| | pypandoc | nbdev |
|---|---|---|
| Mentions | 5 | 45 |
| Stars | 804 | 4,744 |
| Growth | - | 0.5% |
| Activity | 6.8 | 6.5 |
| Last commit | 11 days ago | 3 days ago |
| Language | Python | Jupyter Notebook |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pypandoc
- Web Scraping in Python – The Complete Guide
I recently used [0] Playwright for Python and [1] pypandoc to build a scraper that fetches a webpage and turns the content into sane markdown so that it can be passed into an AI coding chat [2].
They are both very gentle dependencies to add to a project. Both packages contain built-in or scriptable methods to install their underlying platform-specific binary dependencies. This means you don't need to ask end users to use some complex, platform-specific package manager to install Playwright and pandoc.
Playwright lets you scrape pages that rely on JavaScript. Pandoc is great at turning HTML into sensible markdown. Below is an excerpt of the OpenAI pricing docs [3] that has been scraped to markdown [4] in this manner.
[0] https://playwright.dev/python/docs/intro
[1] https://github.com/JessicaTegner/pypandoc
[2] https://github.com/paul-gauthier/aider
[3] https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turb...
[4] https://gist.githubusercontent.com/paul-gauthier/95a1434a28d...
## GPT-4 and GPT-4 Turbo
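The scrape-then-convert workflow described in the comment above could look roughly like the following. This is a minimal sketch, not the commenter's actual code: the helper names `fetch_rendered_html` and `html_to_markdown` are hypothetical, and it assumes Playwright's Chromium build and a pandoc binary are already installed. The `convert` parameter is an illustrative addition so the conversion step can be exercised without pandoc present.

```python
from typing import Callable, Optional


def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chromium so JavaScript-rendered content is included."""
    # Imported lazily so html_to_markdown() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            return page.content()  # full HTML of the rendered page
        finally:
            browser.close()


def html_to_markdown(html: str, convert: Optional[Callable[..., str]] = None) -> str:
    """Convert an HTML string to Markdown.

    `convert` is a hypothetical injection point; by default it is
    pypandoc.convert_text, which shells out to the pandoc binary.
    """
    if convert is None:
        import pypandoc

        convert = pypandoc.convert_text
    return convert(html, "markdown", format="html").strip()
```

A call such as `html_to_markdown(fetch_rendered_html(url))` would chain the two steps; the exact Markdown produced depends on the installed pandoc version.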
- GitHub Accelerator: our first cohort and what's next
- Converting multiple docx to multiple txt files
Use Pypandoc
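The suggestion above could be applied along these lines. This is a sketch under stated assumptions: `convert_docx_folder` is a hypothetical helper name, the folder layout is illustrative, and pandoc's `plain` output format is used as the text target. The `convert` parameter is an illustrative addition that defaults to `pypandoc.convert_file`.

```python
from pathlib import Path
from typing import Callable, List, Optional


def convert_docx_folder(
    src_dir: str, out_dir: str, convert: Optional[Callable[..., str]] = None
) -> List[Path]:
    """Convert every .docx file in src_dir to a .txt file in out_dir."""
    if convert is None:
        # Imported lazily; requires pypandoc and a pandoc binary.
        import pypandoc

        convert = pypandoc.convert_file

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for docx in sorted(Path(src_dir).glob("*.docx")):
        target = out / (docx.stem + ".txt")
        # With outputfile set, pandoc writes the result to disk.
        convert(str(docx), "plain", outputfile=str(target))
        written.append(target)
    return written
```

Each input keeps its base name, so `report.docx` would become `report.txt` in the output folder.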
nbdev
- The Jupyter+Git problem is now solved
- What is literate programming used for?
One example I've seen is ML/DL folks using Jupyter notebooks to develop DL libraries, see https://github.com/fastai/nbdev
- GitHub Accelerator: our first cohort and what's next
- https://github.com/fastai/nbdev: Increase developer productivity by 10x with a new exploratory programming workflow.
- Startups are in first batch of GitHub OS Accelerator
9. Nbdev: Boost developer productivity with an exploratory programming workflow - https://nbdev.fast.ai/
- Start learning python for a Statistician with SAS experience and little R experience
See if you like the nbdev way of working with data through Python and Jupyter. nbdev is an optional part that creates Python packages from Jupyter notebooks. Even the simple tutorials are opinionated and will guide you to unit test your code and write CI/CD pipelines.
- FastKafka - free open source python lib for building Kafka-based services
- isn't this just too much for a take home assignment?
You probably don’t have time for this for the purposes of your task, but I will also throw in the recommendation of nbdev especially if you’re a Python person. I haven’t had a project to use it on yet, but I’ve gone through the docs and the walkthrough and it seems like a great framework for starting potential projects with all the infrastructure needed for if/when they eventually get big and need all the packaging and stuff
- Any experience dealing with a non-technical manager?
nbdev: jupyter notebooks -> python package
- Resources to bridge the gap between jupyter notebooks and regular python development
Take a look at https://github.com/fastai/nbdev - haven't used it, but supposedly the whole of the fast.ai library was written that way. It sounds like a natural direction in your scenario: it allows you to keep working in a familiar environment while still producing production-ready code (well, at least on paper 😅)
- Rant: Jupyter notebooks are trash.
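Several of the comments above describe nbdev as a way to turn Jupyter notebooks into a Python package with tests. As a rough illustration, assuming nbdev 2's `#|` comment directives, the cells of one notebook might look like the sketch below. The module name `core` and the function `say_hello` are illustrative, and the cells are flattened into a single listing here; in a real notebook each directive sits at the top of its own cell.

```python
# --- notebook cell 1: name the module this notebook exports to ---
#| default_exp core

# --- notebook cell 2: marked for export into the package ---
#| export
def say_hello(to: str) -> str:
    "Say hello to somebody."
    return f"Hello {to}!"

# --- notebook cell 3: not exported; doubles as a test and a usage example ---
assert say_hello("nbdev") == "Hello nbdev!"
```

In this setup, running `nbdev_export` would write the exported cells to a module in the package, and `nbdev_test` would execute the notebook, including the assertion cell. Where the module lands depends on the project's settings.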
What are some alternatives?
taffy - A high performance rust-powered UI layout library
papermill - 📚 Parameterize, execute, and analyze notebooks
sniffnet - Comfortably monitor your Internet traffic 🕵️‍♂️
ploomber - The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
formbricks - Open Source Survey Platform
dbt - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. [Moved to: https://github.com/dbt-labs/dbt-core]
nuxt - The Intuitive Vue Framework.
jupytext - Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
trpc - 🧙‍♀️ Move Fast and Break Nothing. End-to-end typesafe APIs made easy.
rr - Record and Replay Framework
responsively-app - A modified web browser that helps in responsive web development. A web developer's must have dev-tool.
Jupyter-PowerShell - Jupyter Kernel for PowerShell