Ask HN: What interesting problems are you working on? (2022 Edition)

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • poly

    A Go package for engineering organisms.

    It is more like the X Y Z W. However, I am working on the X Y Z W bits as well (https://github.com/TimothyStiles/poly, https://github.com/TimothyStiles/allbase, trilo.bio, freegenes.org). I'm going for fully automated "make bacterium X produce molecule Y", which is still a while away (but surprisingly not THAT far off).

  • multiview

    3D computer vision and action recognition research library (by prcvlabs)

    Multiview, multi-object tracking - https://github.com/prcvlabs/multiview

    I started doing research on cameras around 2012 and have been obsessed ever since. Unfortunately, the company behind this code failed, but I've kept poking at it and finding more people interested in it, even though my day job is more about cloud infrastructure. Companies like Zippin, which have carved out a niche in the cashierless space, really impress me.

  • engine262

    An implementation of ECMA-262 in JavaScript

    For an interpreted language like JS, this project is really nice: https://github.com/engine262/engine262 It has more or less two parts: a parser and an evaluator.

  • reals

    A lightweight python3 library for arithmetic with real numbers. (by rubenvannieuwpoort)

    Kind of simple/shallow compared to what some other people are posting here, but I'm working on a Python library to compute numerical expressions with arbitrary precision: https://github.com/rubenvannieuwpoort/reals

  • muzero-general

    MuZero

  • preemptible-thread

    How to preempt threads in user space

  • ideas4

    An Additional 100 Ideas for Computing https://samsquire.github.io/ideas4/

  • matano

    Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS

    I'm building an open source security lake platform (https://github.com/matanolabs/matano). Basically, some of the core problems we solve are:

    - Traditional SIEM tools are not a good fit for large amounts of data — they're either too expensive or come with a high ops burden.

  • Jocko

    Kafka implemented in Golang with built-in coordination (No ZK dep, single binary install, Cloud Native)

    It's still early stages, but I'm building on top of this: https://github.com/travisjeffery/jocko

  • peerreview

    A diamond open access (free to access, free to publish), open source scientific and academic publishing platform.

    I'm working on it as open source and would welcome contributions! (https://github.com/danielbingham/peerreview)

    (Although the first contribution would probably need to be getting the local setup working again in a new context... I've been going fast and taking on some tech debt that will need to be paid down soon.)

  • Benthos

    Fancy stream processing made operationally mundane

    I'm contributing to a data streaming processor called Benthos: https://www.benthos.dev/ It's written in Go, and I really love the project because it's so simple. Unlike most Apache projects, it's a single static binary that does the heavy lifting, and it's stateless.

  • studio

    Robotics visualization and debugging (by foxglove) [discontinued]

    Web-based data visualization for robotics and self-driving. Robotics is such an interesting industry, and we're only scratching the surface of what new tools are needed.

    Try it live here (hit "view sample data"): https://studio.foxglove.dev/

    And it's open source! https://github.com/foxglove/studio

    Shameless plug - we're hiring: https://foxglove.dev/careers

  • dnachips

    Github here - https://github.com/koeng101/dnachips

    The git repository is a bit dead at the moment, since I am trying to get the first machine I need, which is a DNA synthesizer (I asked about that here, for example: https://groups.google.com/g/diybio/c/V3OYVBxaH04).

    The idea is that with a traditional DNA synthesizer I can have positive controls of the chemistry, and develop a chip that can fit inside the flow cell of the existing synthesizer. In biotech, everything goes wrong a bit more often than in computer science, so the focus lately has been getting my hands on a working synthesizer. This is a tried and true method of getting chip synthesis working.

    If that works, I'd like to provide the chip at cost for integrators, as well as develop a functioning full product for integration with some bots I'm building for my official work.

    Personal website is here - http://keonigandall.com

  • ts-pg-orm

    Delightful Typescript PostgreSQL ORM

    * PostgreSQL doesn't have some of the features that you would expect from other DB technologies (e.g. certain table alterations, limit on update and delete, etc.). Weird and wonderful workarounds exist, but they always leave you thinking _why aren't those workarounds just the de facto way of doing those things?_

    [0] https://github.com/samhuk/ts-pg-orm

  • ark

    Go REST API to replace Genbank, Uniprot, Rhea, and CHEMBL (by bebop)

  • Lemmy

    🐀 A link aggregator and forum for the fediverse

    This is a brilliant idea! If you think the problem is bad in the US, you haven't stepped into spaces like Asia. This is a path that can help the world, not just EU/US scenarios.

    May I also suggest you consider the network layer centralization. When you mentioned Github+Stackoverflow, I got the point even before I visited your site.

    However, even as you think about an alternative for how we publish, consider that technical questions can have significant political consequences. I am of the view that centralized networks are a major contributor to the situation we find ourselves in today. Distributed/decentralized/federated options like ActivityPub may help in your journey with what surely is a great idea. Check out Lemmy, for example, as a basis for a real Stack Overflow alternative (https://join-lemmy.org/), and Gitea is already working on a federated "GitHub".

  • FlatBuffers

    FlatBuffers: Memory Efficient Serialization Library

    - Moved files: How to not lose all customized information (rating, playlists, playcount, etc.) when a file gets moved to another place?

    Many filesystems try to solve the same problem (e.g. customizing the appearance of files in a particular folder [1]). One solution is adding extended file attributes [2]; however, this might not be supported on all operating systems.

    [1] https://en.wikipedia.org/wiki/.DS_Store

    [2] https://en.wikipedia.org/wiki/Extended_file_attributes

    A slower but more portable solution might be content-addressable storage. Basically, create a directory containing just metadata files, one for each song. Name each file after the SHA256 sum of the associated music file, and put the metadata into it in a binary format like FlatBuffers [3] or Cap'n Proto [4], or in a plaintext format like TOML [5] if you prefer to make the system human-editable at the cost of lower performance. Even after a file is moved to another location, its SHA256 sum should not change.

    Note that duplicated files hash to the same value, so they will share one metadata file and you'll have to reconcile any metadata differences (or you can just merge the metadata together, keeping the attributes with the later timestamp). There are various other solutions to this as well, like building a parallel directory structure that mirrors your music filesystem, but that can get complicated.

    [3] https://google.github.io/flatbuffers/

    [4] https://capnproto.org/

    [5] https://toml.io/en/
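
    The content-addressable scheme described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: JSON from the Python standard library stands in for FlatBuffers, Cap'n Proto, or TOML, and the function names and the store layout are hypothetical, not taken from any of the linked projects.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file contents in chunks so large audio files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def metadata_path(store: Path, song: Path) -> Path:
    # The key depends only on the file's bytes, never on where the file lives.
    return store / f"{sha256_of_file(song)}.json"


def save_metadata(store: Path, song: Path, metadata: dict) -> Path:
    """Write the metadata file for a song (JSON is an assumed stand-in format)."""
    store.mkdir(parents=True, exist_ok=True)
    target = metadata_path(store, song)
    target.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return target


def load_metadata(store: Path, song: Path) -> dict:
    """Return the stored metadata for a song, or an empty dict if none exists."""
    target = metadata_path(store, song)
    if not target.exists():
        return {}
    return json.loads(target.read_text(encoding="utf-8"))
```

    Because the lookup key is derived from the content, calling load_metadata on a song after it has been moved or renamed finds the same metadata file. Two byte-identical copies of a song share one entry, which is the duplicate case mentioned above.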

    - File-Watchers: How to prevent fully indexing the filesystem over and over again and only react to changes?

    When first loading a directory of music into the program, build a Merkle tree [6] of the files' hashes and save them to the content-addressable storage directory described above if they do not already exist. Once indexing is complete, serialize the Merkle trees for each directory as well. That way, the next time the program starts, you can just load these up and check the consistency of the files in the background. Then set up FileSystemWatcher [7] to notify you when the contents of a directory change, and update the metadata files and Merkle trees accordingly.

    [6] https://en.wikipedia.org/wiki/Merkle_tree

    [7] https://stackoverflow.com/questions/721714/notification-when...
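
    A rough sketch of the Merkle tree idea above, using only the Python standard library. The tree layout (nested dicts with "hash" and "children" keys) and the function names are assumptions made for illustration. The file-watching part is left out, since FileSystemWatcher is a .NET API and a Python version would need a third-party library.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """SHA256 of a file's contents (read whole here to keep the sketch short)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def merkle_tree(directory: Path) -> dict:
    """Return {"hash": ..., "children": {...}} for a directory; files are leaves."""
    children = {}
    for entry in sorted(directory.iterdir(), key=lambda p: p.name):
        if entry.is_dir():
            children[entry.name] = merkle_tree(entry)
        elif entry.is_file():
            children[entry.name] = {"hash": file_hash(entry)}
    # A directory's hash covers the names and hashes of everything directly below it.
    digest = hashlib.sha256()
    for name, node in children.items():
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(node["hash"].encode("ascii"))
    return {"hash": digest.hexdigest(), "children": children}


def changed_paths(old: dict, new: dict, prefix: str = "") -> list:
    """List paths whose hashes differ, descending only into changed subtrees."""
    if old.get("hash") == new.get("hash"):
        return []
    old_children = old.get("children")
    new_children = new.get("children")
    if old_children is None or new_children is None:
        return [prefix]  # a file changed, or a file and a directory swapped places
    changed = []
    for name in sorted(set(old_children) | set(new_children)):
        path = f"{prefix}/{name}" if prefix else name
        if name not in old_children or name not in new_children:
            changed.append(path)  # added or removed entry
        else:
            changed.extend(changed_paths(old_children[name], new_children[name], path))
    return changed
```

    The tree is plain dicts and strings, so it can be serialized with json.dumps when indexing finishes and reloaded on the next start. Comparing the saved tree against a freshly built one with changed_paths skips every subtree whose hash is unchanged, although building the fresh tree in this sketch still rehashes every file.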

  • Cap'n Proto

    Cap'n Proto serialization/RPC system - core tools and C++ library

  • sdcv

    On the command line, using sdcv[0] with a StarDict dictionary generated from Wiktionary[1] is a great combo.

    [0] https://github.com/Dushistov/sdcv

  • ebook-reader-dict

    Finally decent dictionaries based on Wiktionary for your beloved eBook reader.

  • how-to-synbio

    The resources I always recommend to new synthetic biologists.

    That's really cool! I was about to ask how to get started on that field, but I also noticed that you have that covered in https://github.com/TimothyStiles/how-to-synbio

  • m4b-tool

    m4b-tool is a command line utility to merge, split and chapterize audiobook files such as mp3, ogg, flac, m4a or m4b

  • beets

    music library manager and MusicBrainz tagger

  • tone

    tone is a cross-platform audio tagger and metadata editor to dump and modify metadata for a wide variety of formats, including mp3, m4b, flac and more. It has no dependencies and can be downloaded as a single binary for Windows, macOS, Linux and other common platforms.

  • libu8ident

    Unicode security guidelines for identifiers

  • nbperf

    Improved NetBSD's Perfect Hash Generation Tool v3

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
