Scientific Computing with Perl

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Photonic

    Photonic and metamaterials calculations

  • Here is a link to the PDL book <http://pdl.perl.org/content/pdl-book-toc.html>.

    I can share some examples of using PDL:

    - Demos of basic usage <https://metacpan.org/release/ETJ/PDL-2.050/source/Demos/Gene...>

    - Image analysis <https://nbviewer.ipython.org/github/zmughal/zmughal-iperl-no...> (I am also the author of IPerl, so if you have questions about it, let me know. My top priority with IPerl right now is to make it easy to install.)

    - Physics calculations <https://github.com/wlmb/Photonic>

    - Access to GSL functions for integration and statistics (with comparisons to SciPy and R): <https://gist.github.com/zmughal/fd79961a166d653a7316aef2f010...>. Note how PDL can take an array of values as input (which gets promoted into a PDL of type double) and then return a PDL of type double of the same size. Once converted to a PDL, the values of the original array are processed entirely in C.

    - Example of using Gnuplot <https://github.com/PDLPorters/PDL-Graphics-Gnuplot/blob/mast...>.

    ---

    Just to give a summary of how PDL works relative to XS:

    PDL lets you create numeric ndarrays with any number of dimensions and a specific type (e.g., byte, float, double, complex double), which generalized functions then operate on. These functions are compiled using a DSL called PP, which generates multiple XS functions: it takes a signature that defines the number of dimensions the function operates over for each input/output variable and adds loops around it. These loops are quite flexible and can work in-place so that no temporary arrays are created (this also allows for pre-allocation). The loops will run multiple times over the same piece of memory; this is still fast unless you have many small computations.

    If you do have many small computations, the PP DSL is available to users as well: you can take a specific PDL computation written in Perl, translate its innermost loop into C, and have the whole computation run in one loop (a faster data access pattern). There is a book on this as well: "Practical Magick with C, PDL, and PDL::PP -- a guide to compiled add-ons for PDL" <https://arxiv.org/abs/1702.07753>.

    ---

    I'm also active on the `#pdl` IRC channel on <https://www.irc.perl.org/>, so feel free to drop by.
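
The point in the comment above about several loops over the same memory versus one fused loop can be illustrated with a minimal sketch. This is plain Python, not PDL or PP; the function names `multi_pass` and `single_pass` are hypothetical and exist only for this illustration. The first version mimics chaining whole-array operations, each of which walks the data once and creates a temporary. The second does the same arithmetic in one traversal, which is roughly what translating the innermost loop into C is meant to achieve.

```python
def multi_pass(a, b, c):
    """Chain whole-array steps: each step is its own loop and temporary."""
    product = [x * y for x, y in zip(a, b)]        # pass 1: temporary array
    shifted = [p + z for p, z in zip(product, c)]  # pass 2: another temporary
    return sum(shifted)                            # pass 3: reduction


def single_pass(a, b, c):
    """Fuse the same arithmetic into one loop with no temporaries."""
    total = 0.0
    for x, y, z in zip(a, b, c):
        total += x * y + z
    return total


a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = [0.5, 0.5, 0.5]
print(multi_pass(a, b, c), single_pass(a, b, c))
```

Both functions return the same value; the difference is how many times the data is traversed and how many intermediate arrays are allocated.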

  • PDL-Graphics-Gnuplot

    Gnuplot-based plotting backend for PDL

  • numpysane

    more-reasonable core functionality for numpy

  • I used Perl and PDL heavily before moving to Python and numpy. Both have annoying issues, and oddly, their warts are complementary. In particular, the core API in PDL is miles better than numpy's. Before I could tolerate actually using numpy, I had to write a library to patch away numpy's warts, effectively by writing a PDL compatibility layer. Check it out:

    https://github.com/dkogan/numpysane/

    Now the core numpy has usable broadcasting, concatenation and basic linear algebra. Kudos to the PDL team for the excellent core design.
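
For readers unfamiliar with the term, broadcasting is the rule that decides how arrays of different shapes combine. The sketch below implements the general NumPy-style shape rule in plain Python, as an assumption-laden illustration: it is not numpysane's API or its implementation, and the helper name `broadcast_shape` is hypothetical. Shapes are aligned from the trailing dimension, and two dimensions are compatible when they are equal or one of them is 1.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the shape two arrays would broadcast to, or raise ValueError."""
    result = []
    # Walk both shapes from the last dimension; missing dimensions count as 1.
    for x, y in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"incompatible dimensions: {x} and {y}")
    return tuple(reversed(result))


print(broadcast_shape((3, 1), (4,)))
print(broadcast_shape((2, 3), (3,)))
```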

  • CPython

    The Python programming language

  • See also my blog post "How you average numbers matters"[2].

    > Now, in the real world, you have programs that ingest untold amounts of data. They sum numbers, divide them, multiply them, do unspeakable things to them in the name of “big data”. Very few of the people who consider themselves C++ wizards, or F# philosophers, or C# ninjas actually know that one needs to pay attention to how you torture the data. Otherwise, by the time you add, divide, multiply, subtract, and raise to the nth power you might be reporting mush and not data.

    > One saving grace of the real world is the fact that a given variable is unlikely to contain values with such an extreme range. On the other hand, in the real world, one hardly ever works with just a single variable, and one can hardly ever verify the results of individual summations independently.

    Correct algorithms may be slower, but I am hoping that it is easy to understand why they ought to be preferred.

    [1]: https://github.com/python/cpython/blob/5571cabf1b3385087aba2...
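
A small illustration of why the summation method matters, offered as a sketch rather than as the method from the blog post: the input values are made up to show an extreme range, and the function names `naive_mean` and `accurate_mean` are hypothetical. The naive version accumulates left to right in floating point; the other uses `math.fsum` from the Python standard library, which tracks the lost low-order bits.

```python
import math


def naive_mean(values):
    """Average by plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def accurate_mean(values):
    """Average using math.fsum, which avoids losing low-order bits."""
    return math.fsum(values) / len(values)


# The 1.0 is absorbed when added to 1e16, so the naive sum loses it entirely.
values = [1e16, 1.0, -1e16]
print(naive_mean(values))
print(accurate_mean(values))
```

With these inputs the naive mean comes out as 0.0, while the `math.fsum` version yields one third, the mathematically correct result.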

  • mce-cookbook

    Discontinued Cookbook for Many-Core Engine

  • Good ideas!

    a)

    A built-in way would be good. Some work is exploring the use of OpenMP with Perl/PDL to get some of that. In the meantime, there is MCE, which distributes work across processes, and there are examples of using it with PDL <https://github.com/marioroy/mce-cookbook#sharing-perl-data-l...>, but I have not had an opportunity to use it.

    b)

    Output for a spreadsheet would be difficult, if I understand the problem correctly. This would be more about creating a mapping of PDL function names to spreadsheet function names, and not all PDL functions exist in spreadsheet languages. It might be possible to embed or do IPC with a Perl interpreter like <https://www.pyxll.com/>, but I don't know how easy that would be to deploy when distributing to users.

    Am I understanding correctly?

    Interestingly enough, creating a mapping of PDL functions would be useful for other reasons, so the first part might be possible, but the code might need to be written in a certain way that makes writing the dataflow between cells easier.
