CPython
Pandas
Metric | CPython | Pandas
---|---|---
Mentions | 1193 | 371
Stars | 53,542 | 38,499
Growth | 2.7% | 0.9%
Activity | 10.0 | 10.0
Last commit | 3 days ago | 4 days ago
Language | Python | Python
License | GNU General Public License v3.0 or later | BSD 3-clause "New" or "Revised" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
CPython
-
Show HN: Word2vec Algorithm in ~100sloc with NumPy
Hi eesmith, thanks for your insight! I've applied a lot of the improvements you mentioned. The collections.Counter one didn't even cross my mind, very neat :)
The lines are processed independently as they are separate sentences. One line may be from one source, and the next from a completely different source. The problem with simply concatenating them is that the target words on the boundaries between sentences can end up with context words from unrelated sentences.
> I liked seeing the 'η' - I haven't yet used non-ASCII in my code!
Then you may enjoy this April Fools' Python issue I made: https://github.com/python/cpython/issues/103172
Yeah, I like using utf-8 chars for mathematical symbols, it makes my brain hurt a little less when mapping between the two! I also like using ŷ for predictions in ML contexts as that's the canonical symbol used in the literature.
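To illustrate the sentence-boundary point made in the comment above, here is a minimal sketch, not the poster's actual code. The `context_pairs` helper, the toy corpus, and the window size are all assumptions made up for this example. It builds (target, context) pairs per line and compares them with the pairs produced by naively joining the lines.

```python
from collections import Counter


def context_pairs(sentences, window=2):
    """Yield (target, context) pairs without crossing sentence boundaries.

    Hypothetical helper for illustration; not taken from the linked project.
    """
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    yield target, tokens[j]


# Two unrelated lines, as in a corpus that stores one sentence per line.
corpus = ["the cat sat", "stocks fell sharply"]

# Pairs built line by line versus pairs built from the joined text.
per_sentence = set(context_pairs(corpus, window=2))
concatenated = set(context_pairs([" ".join(corpus)], window=2))

# Pairs that exist only because the two lines were glued together.
leaked = concatenated - per_sentence

# Word frequencies via collections.Counter, the helper mentioned in the thread.
vocab = Counter(word for line in corpus for word in line.lower().split())
```

In this toy corpus, `leaked` holds pairs such as `("sat", "stocks")`, which link words from unrelated sentences. Processing each line independently avoids exactly these pairs.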
-
Is there a way to build a project from source with the same process between Windows and OSX?
So I wanted to test this out with Python, as I want to embed Python to support scripting in my app. However, as you can see in the CPython README, the build process for Windows and macOS is a bit different (macOS involves running Make, while Windows just runs a bat file). Neither is complicated, but I originally assumed that the process would be basically the same and that I could use a CMakeLists that wasn't platform specific.
-
Building and deploying a web API powered by ChatGPT
We'll also end up using Python, Docker and Git.
-
Never again
Anything using inspect.getargspec due to this change.
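As a hedged sketch of what migrating such code can look like, assuming "this change" refers to the removal of `inspect.getargspec` (it is no longer available as of Python 3.11): `inspect.signature` is the general-purpose replacement, and `inspect.getfullargspec` is the closest match to the old return shape. The `greet` function is a made-up example.

```python
import inspect


def greet(name, punctuation="!", *, loud=False):
    """Hypothetical function used only to demonstrate introspection."""
    text = f"Hello, {name}{punctuation}"
    return text.upper() if loud else text


# inspect.signature works for positional, keyword-only, and annotated parameters.
sig = inspect.signature(greet)
param_names = list(sig.parameters)
defaults = {
    name: param.default
    for name, param in sig.parameters.items()
    if param.default is not inspect.Parameter.empty
}

# inspect.getfullargspec returns a named tuple similar to the old getargspec
# result, with extra fields for keyword-only arguments and annotations.
spec = inspect.getfullargspec(greet)
```

Code that only read `args` and `defaults` from the old result can usually switch to the same fields on `getfullargspec`, while new code is generally better served by `signature`.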
-
I'm interested in making a game, but don't know where to start. Advice would be appreciated!
There’s not a single best route, but I’d suggest spending an evening with Scratch, then learning a bit of Python via Codecademy. After that, I recommend either Godot, Java, or C++.
-
Python __init__ Vs __new__ Method - With Examples
It can be useful in some uncommon use cases to make some, let's say, "magic" classes. For example, the enum module in the stdlib makes use of __new__ to create enum classes.
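A minimal sketch of the two situations mentioned above, with class names invented for this example: an immutable built-in subclass, where the value has to be fixed in `__new__` because it cannot be changed later in `__init__`, and an `Enum` whose members are built by a custom `__new__`.

```python
from enum import Enum


class Celsius(float):
    """Immutable float subclass: the numeric value must be set in __new__."""

    def __new__(cls, degrees):
        if degrees < -273.15:
            raise ValueError("temperature below absolute zero")
        return super().__new__(cls, degrees)

    def __init__(self, degrees):
        # The float value is already fixed here; __init__ only adds attributes.
        self.unit = "°C"


class Color(Enum):
    """Enum whose members carry an extra attribute, set up in __new__."""

    RED = ("r", "#ff0000")
    GREEN = ("g", "#00ff00")

    def __new__(cls, code, hex_value):
        obj = object.__new__(cls)
        obj._value_ = code          # the canonical value used for lookups
        obj.hex_value = hex_value   # extra data attached to each member
        return obj
```

With these definitions, `Celsius(21.5)` behaves like the float `21.5`, and `Color("g")` looks up `Color.GREEN` by the short code while `Color.GREEN.hex_value` carries the extra data.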
-
How to Get Started with Open Source
Python
-
Help me guide my boyfriend
Here are links to the things I talked about in this comment: Python (high-level programming language) https://www.python.org/, Visual Studio Code (code editor) https://code.visualstudio.com/, Stack Overflow (programming questions) https://stackoverflow.com/, W3Schools (learning programming languages) https://www.w3schools.com/.
Pandas
-
Beaver: a common lisp library for data analysis and manipulation
Hello there folks! I decided to create a data analysis library modeled after pandas. As with all things, this library isn't perfect. It currently only supports simple CSV files, which it reads into a 2D matrix. Here is how it currently looks
-
How do I get Local LLM to analyze an whole excel or CSV?
I think the model should be able to work out that it should use a tool like [pandas](https://pandas.pydata.org/) rather than analyze the data with its own capabilities.
-
Why are physics undergrads told to "learn programming" and what does this consist of?
pandas: you mention employability, and this is one of the most powerful ways to wrangle data in Python, say as a data analyst. I have used it for some of my research projects because it lets you easily select elements from a data table based on shared characteristics or a custom function, then plot them or perform statistical analysis on them.
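The kind of selection and per-group statistics described above might look like the following sketch. The column names and values are invented for this example and do not come from the original comment.

```python
import pandas as pd

# Hypothetical measurements, invented purely for illustration.
df = pd.DataFrame(
    {
        "detector": ["A", "A", "B", "B", "B"],
        "energy_keV": [10.0, 12.0, 30.0, 34.0, 32.0],
    }
)

# Select rows that share a characteristic, using a boolean mask.
detector_b = df[df["detector"] == "B"]

# Select rows with a custom function applied to each value.
high_energy = df[df["energy_keV"].apply(lambda e: e > 11.0)]

# Per-group statistics: mean, standard deviation, and row count per detector.
stats = df.groupby("detector")["energy_keV"].agg(["mean", "std", "count"])
```

From here, `stats.plot(kind="bar")` would be one way to plot the result, assuming matplotlib is installed.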
-
A Polars exploration into Kedro
Traditionally Kedro has favoured pandas as a dataframe library because of its ubiquity and popularity. This means that, for example, to read a CSV file, you would add a corresponding entry to the catalog:
-
Pandas AI – The Future of Data Analysis
I asked GPT-4 this
can you visit https://pandas.pydata.org/about/governance.html and tell me if I am allowed to use the term 'pandas' in the name of another unaffiliated project, for example 'pandas-ai'
--
Based on the BSD 3-Clause License under which pandas is released, neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. This means that to use the term 'pandas' in the name of another unaffiliated project such as 'pandas-ai', you would likely need to get written permission from the pandas project's copyright holders.
However, please note that this is not legal advice, and it would be a good idea to consult with a lawyer who specializes in open-source software or intellectual property law to ensure that you're in compliance with all legal requirements.
-
Benchmarking for Pandas and Polars Using CSV and Parquet File
I have updated this issue: https://github.com/pola-rs/polars/issues/8533. Please help to solve it. I have also filed a similar issue with pandas: https://github.com/pandas-dev/pandas/issues/53249
-
PSA: You don't need fancy stuff to do good work.
Before diving into advanced machine learning algorithms or statistical models, we need to start with the basics: collecting and organizing data. Fortunately, both Python and R offer a wealth of libraries that make it easy to collect data from a variety of sources, including web scraping, APIs, and reading from files. Key libraries in Python include requests, BeautifulSoup, and pandas, while R has httr, rvest, and dplyr.
-
[OC] Analyzing 15,963 Job Listings to Uncover the Top Skills for Data Analysts (update)
Analysis was done in Jupyter Notebook with Python 3.10, Pandas, Matplotlib, wordcloud and Mercury framework.
-
[OC] Data Analyst Skills in need based on 15,963 job listings
Analysis was done in Jupyter Notebook with a Python 3.10 kernel, Pandas, Matplotlib, wordcloud, and the Mercury framework, which shares the notebook as a web application with widgets and hidden code. GIF created in Canva.
-
dictf - An extended Python dict implementation that supports multiple key selection with a pretty syntax.
Speaking of pandas: groupby used to treat x and [x] the same way. Now it treats them differently, but it is still forced to decide whether a value is scalar or iterable. Maybe in 10 years we will get another flavor of the idea? Which one is best? This kind of "design roaming" is symptomatic of this sort of API, and for good reason: there is no winning solution, so it will always be broken by design. See https://github.com/pandas-dev/pandas/pull/47761
What are some alternatives?
Cubes - Light-weight Python OLAP framework for multi-dimensional data analysis
tensorflow - An Open Source Machine Learning Framework for Everyone
RustPython - A Python Interpreter written in Rust
orange - 🍊 📊 💡 Orange: Interactive data analysis
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
pyexcel - Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
Keras - Deep Learning for humans
SymPy - A computer algebra system written in pure Python
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
Dask - Parallel computing with task scheduling
NumPy - The fundamental package for scientific computing with Python.
blaze - NumPy and Pandas interface to Big Data