Ask HN: I have 10 yrs of Exp. Failed 4 takehome projects. What am I doing wrong?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • takehome-sample

  • >If your argument is that the lambda gets optimized out, or that the benchmarked difference is insignificant (it very well could be!), then I could understand that.

    In interpreted languages, I believe what gets saved to memory is a pointer to the code itself, so there is no inherent difference here in terms of allocation. For closures, the reference counts of the captured variables are bumped, but this is not a closure.

    Still, this is a minor thing. Scoped functions are used for structural purposes; the performance benefit of using or not using them is negligible.
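
    For instance (a minimal CPython sketch, not from the original submission), a scoped helper is compiled once; each call to the outer function only creates a cheap function object that points at the same code object:

      def handler():
          # a scoped helper: every call to handler() builds a new function
          # object, but it points at the same pre-compiled code object, so
          # the per-call allocation is tiny
          def fmt(x):
              return f"{x:.2f}"
          return fmt

      # same compiled code behind both function objects (CPython behaviour)
      assert handler().__code__ is handler().__code__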

    >logN is not faster than O(1) - NLogN is in the API, not redis

    My point is that O(log n) is comparable to O(1). It's that fast.

    O(n log n) is not comparable to O(n). In fact, if you look at that chart, O(n log n) is closer to O(n^2).

    You definitely don't want O(n log n) work on the application server, whether it's Node.js or Python. It will block the Python or Node thread completely if n is large.

    It's preferable to use SORT in Redis rather than in Python or Node because Redis is highly optimized for this sort of thing (pun intended), being written in C. Even if it's not Redis, in general for backend web development you want to move as much compute as possible to the database and off the web server. The web server is all about IO and data transfer; the database is where compute and processing happen. Keep sorting off web servers and leave it in the database. You're a Node.js dev? This matters even more for Node, given that the default programming model is single threaded.

    Overall it's just better to use sorted sets with scores because O(log n) is blazingly fast. It's so fast that databases often prefer O(log n) tree-based indexes over hash indexes even when a hash index would do.

    Yes, hash indexes are O(1) average-case insert and read, but the memory savings of tree-based indexes overshadow the negligible O(log n) cost.
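
    To make that concrete, here's a minimal sketch with redis-py (the key and member names are made up, not from the take-home): increments are O(log n), and Redis hands the routes back already ordered by score, so the web server never sorts anything.

      import redis

      r = redis.Redis()

      # count a hit: O(log n) insert/update into the sorted set
      r.zincrby("route_hits", 1, "/users/")

      # read the top routes already sorted by score, highest first --
      # the sorting happens inside Redis, not on the web server
      top = r.zrevrange("route_hits", 0, 9, withscores=True)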

    >I think my version is the least surprising one -- no one has to know about pipeline or worry about atomicity. Just an O(1) operation to redis, like most people would expect to see.

    Two things here.

    1st. Sorting on the web API server is definitively wrong. This is especially true in Node.js, where the mantra is always non-blocking code.

    Again, the overall mantra for backend web is to push compute to the database.

    2nd. Pipeline should 100 percent be known. Atomic operations across shared storage are 100 percent required and happen all the time. Database transactions are fundamental, not unexpected, and pipeline should be extremely common. This is not an obscure operation. I commented on it in my code in case the reviewer was someone like you who isn't familiar with how common atomic database transactions are, but it is not an obscure trick.

    Additionally, pipeline has nothing to do with your implementation or mine. It's there to make a state mutation and the retrieval of that state atomic.

    I add one to a score, then I retrieve the score, and I want to make sure no one changes the score between the increment and the retrieval. This is needed regardless of which internal data structure Redis uses.
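
    As a sketch of what that looks like with redis-py (the key and route names are hypothetical, not the actual submission), a transactional pipeline wraps the increment and the read-back in MULTI/EXEC so nothing can slip in between:

      import redis

      r = redis.Redis()

      # MULTI/EXEC via a transactional pipeline: the increment and the
      # read-back execute atomically, so no other client can change the
      # score in between
      pipe = r.pipeline(transaction=True)
      pipe.zincrby("route_hits", 1, "/users/")
      pipe.zscore("route_hits", "/users/")
      new_score, read_back = pipe.execute()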

    >I don't know if they tried to run your test, but it could have failed with a 3xx

    That'd be stupid in my opinion. A 302 redirect is something I expected. Maybe I should have stated that explicitly in the docs.

    >IMO -- it's more realistic and you have full control.

    Unlikely. Even as a lead, it's better to cater to the majority opinion of the team. 99 percent of the time you don't have full control.

    Developers and leads typically evolve to fit into the overall team culture while inserting a bit of their own opinions.

    I'm a Rust and Haskell developer applying for a Python job. Do I impose my opinions wholesale to control the team? No, I adapt to all the bad (in my opinion) and good practices of the team, and I introduce my own ideas slowly and where appropriate.

    > it is a mystery why they wouldn't give feedback.

    This is easy. It's not a mystery. It's because they don't give a shit about me. They don't want to hire me so the business relationship is over. Legally they owe me zero time to explain anything and it's more efficient for the business to not waste time on feedback.

    >Yes, but this is a small API -- you literally have to write a test that hits the server once. There are libs for doing this with flask, there is documentation showing you how. It's not rocket science, and it's crucial to catching bugs down the road.

    No, there isn't. You have to spin up Redis, and the Flask documentation doesn't talk about that. Integration testing involves code that controls docker-compose: a bunch of hacky scripts making external process calls just to get the integration tests running.

    It's not rocket science, but it's not within the scope of a take-home. Additionally, hacking around docker-compose is not sustainable or future-proof; eventually you will hit infrastructure that can't be replicated with Docker.

    I will probably do it next time just to cater to people who share your viewpoints.

    >If the prompt was "write this like you're at a startup that has no money and no time", then sure.

    Lol. The prompt was to write it in four hours (no time) and they aren't paying me for it (no money).

    >Specs did specify test cases -- and none of them had a trailing slash.

    What? In the specs all examples have trailing slashes. All of them.

    https://github.com/anonanonme/takehome-sample/blob/master/RE...

    Take a look again, you remembered wrong.

    What the specs didn't specify was what to do when there is no trailing slash. So why not a 302? The client can choose what to do here. Either follow the redirect or treat it as an error.
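
    A minimal sketch of that choice in Flask (the route names are hypothetical): the slashed path from the spec's examples is canonical, and the bare path answers with a 302 the client can follow or treat as an error.

      from flask import Flask, redirect, url_for

      app = Flask(__name__)

      # canonical form, matching the spec's examples with a trailing slash
      @app.route("/stats/")
      def stats():
          return {"routes": []}

      # the form the spec left undefined: answer with a 302 to the canonical
      # path and let the client decide what to do with it
      @app.route("/stats")
      def stats_no_slash():
          return redirect(url_for("stats"), code=302)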

  • minio

    The Object Store for AI Data Infrastructure

  • >Again, here you seem to be arguing against a strawman that doesn't know that blocking the IO loop is bad. Try arguing against one that knows ways to work around that. This is why I'm saying this rule isn't true. Extensive computation on single-threaded "scripting" languages is possible (and even if it wasn't, punt it off to a remote pool of workers, which could also be NodeJS!).

    It's very rare to find a rule that's absolutely true. I clearly stated exceptions to the rule (which you repeated), but the generality still holds.

    Threading in Node.js is new and didn't exist the last time I touched it. It doesn't look like the standard use case either, since Google searches still turn up pages everywhere titled "Node is single threaded." The only way I can see this working is multiple processes (each with its own copy of V8) using OS shared memory for IPC, and they're just calling it threads. It would take a shitload of work to make V8 actually multi-threaded.

    Processes are expensive, so you can't really follow this model per request. And we stopped doing a thread per request over a decade ago.

    Again, these are exceptions to the rule. From what I'm reading, Node.js is normally still single threaded, with a fixed number of worker processes that are called "threads". Under this, my general rule is still generally true: backend engineering typically means writing non-blocking code and offloading compute elsewhere. Again, there are exceptions, but as I said before, those exceptions are rare.

    >Here's what I mean -- you can actually solve the ordering problem in O(N) + O(M) time by keeping track of the max you've seen and building a sparse array and running through every single index from max to zero. It's overkill, but it's generally referred to as a counting sort:

    Oh come on. We both know these sorts won't work here. Large counts blow up the memory. Imagine 3 routes: one gets 352 hits, another gets 400 hits, and another gets 600,000 hits. What's the big O for memory and for the sort?

    It's O(600,000) for both memory and runtime; N=3 doesn't even matter here. These sorts are almost never used for exactly this reason: they only work when the range of values is small. It's especially not useful for this project. It's as if this project was designed to make counting sort fail big time.

    Also, we don't need to talk about the O(N) read and write; that's a given, it's always there.
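
    In code, the objection is just this (using the made-up numbers from the example above): the bucket array is sized by the largest count, not by the number of routes.

      # three routes, but a counting sort needs a slot for every possible count
      hits = {"/a": 352, "/b": 400, "/c": 600_000}
      buckets_needed = max(hits.values()) + 1   # 600,001 slots for just 3 routes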

    >I don't think these statements make sense -- having docker installed and having redis installed are basically equivalent work. At the end of the day, the outcome is the same -- the developer is capable of running redis locally. Having redis installed on your local machine is absolutely within range for a backend developer.

    Unfortunately, these statements do make sense, and your characterization seems completely dishonest to me. People like to keep their local environments clean and segregated from server daemons. In your universe web developers apparently install Redis, PostgreSQL, and Kafka locally, but that just sounds absurd to me. We can agree to disagree, but from my perspective you're not being realistic here.

    >Also, remote development is not practiced by many companies -- the only companies I've seen doing thin-clients that are large.

    It's practiced by a large number of companies, including basically every company I've worked at for the past 5 years. Every company has to do at least some remote dev in order to fully test E2E flows or integrations.

    >I see it as just spinning up docker, not compose -- you already have access to the app (ex. if it was buildable via a function) so you could spawn redis in a subprocess (or container) on a random port, and then spawn the app.

    Sure. The point is it's hacky to do this without an existing framework. I'll check out that library you linked.

    >I agree that integration testing is harder -- I think there's more value there.

    Of course there's more value. You get more value at higher cost. That's been my entire point.

    >Also, for replicating S3, minio (https://github.com/minio/minio) is a good stand-in. For replicating lambda, localstack (https://docs.localstack.cloud/user-guide/aws/lambda/) is probably reasonable there's also frameworks with some consideration for this (https://www.serverless.com/framework/docs/providers/aws/guid...) built in.

    Good finds. But what about SNS, IoT, BigQuery, and Redshift? Again, my problem isn't with specific services; it's with infra in general.

    >Ah, this is true -- but I think this is what people are testing in interviews. There is a predominant culture/shared values, and the test is literally whether someone can fit into those values.

    No. I think what's going on is that people aren't putting much thought into what they're actually interviewing for. They just have some made-up bar in their mind, whether it's a leetcode algorithm or whether the candidate wrote a unit test for the one pure function available for testing.

    >Whether they should or should not be, that's at least partially what interviews are -- does the new team member feel the same way about technical culture currently shared by the team.

    The answer is no. There are always developers who disagree with things and just don't reveal it. Think about the places you've worked at. Were you in total agreement? I doubt it. A huge number of devs are opinionated and think company policies or practices are BS. People adapt.

    >Now in the case of this interview your solution was just fine, even excellent (because you went out of your way to do async io, use newer/easier packaging methodologies, etc), but it's clearly not just that.

    The testing is just a game. I can play the game and suddenly I pass all the interviews. I think this is the flaw in your methodology, as I just need to write tests to get in. Google, for example, attempted another method in spirit: testing IQ via algorithms. It's a much higher bar.

    The problem with Google's approach is that it can also be gamed, though it's much harder to game, and often the bar is higher than the actual job the engineer is expected to do.

    I think both methodologies are flawed, but ignoring raw ability and picking people based on weirdly specific cultural preferences is the worse of the two.

    Put it this way. If a company has a strong testing culture, then engineers who don't typically test things will adapt. It's not hard to do, and testing isn't so annoying that they won't do it.

  • testcontainers-python

    Testcontainers is a Python library that provides a friendly API to run Docker containers. It is designed to create a runtime environment to use during your automated tests.

  • > ...

    I think this is where we're talking past each other, so let me explain more of how I see the problem -- the solution I have in mind is serializing the URL and using ONE call to INCR (https://redis.io/commands/incr/) on the ingest side.

    There is a lot you can do with the data storage pattern to make other operations more efficient, but on the stats side, the most basic way to do it is to scan the keys and read each counter.

    I will concede that, given we know the data should fit in memory (otherwise you just crash), your approach gives you O(N) retrieval time, and it's definitely superior not to have to do that on the Python side (with Python just streaming the response through). I am comfortable optimizing in-API computation, so I don't think it's a problem.
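
    A rough sketch of that pattern (the "hits:" key prefix and the helper names are my assumptions, not from the thread): one INCR per request on the ingest side, a key scan plus reads on the stats side.

      import redis

      r = redis.Redis(decode_responses=True)

      def record_hit(url: str) -> None:
          # ingest side: a single O(1) INCR against the serialized URL
          r.incr(f"hits:{url}")

      def all_counts() -> dict[str, int]:
          # stats side: walk the keyspace and read each counter
          counts = {}
          for key in r.scan_iter(match="hits:*"):
              counts[key.removeprefix("hits:")] = int(r.get(key))
          return counts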

    Here's what I mean -- you can actually solve the ordering problem in O(N) + O(M) time by keeping track of the max you've seen and building a sparse array and running through every single index from max to zero. It's overkill, but it's generally referred to as a counting sort:

    https://ebrary.net/81651/geography/sorting_algorithms

    This is overkill, clearly, and I will concede that ZSET is lighter and easier to get right than this.
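
    For reference, a small sketch of the counting-sort idea described above (the function and variable names are mine): bucket each route by its hit count, then walk from the max count down to zero. Time is O(N + M) and memory is O(M), where M is the largest count, which is exactly the memory objection raised elsewhere in the thread.

      def order_by_hits(counts: dict[str, int]) -> list[tuple[str, int]]:
          # counting sort: the buckets and the final walk are O(max_count)
          max_count = max(counts.values(), default=0)
          buckets: list[list[str]] = [[] for _ in range(max_count + 1)]
          for route, n in counts.items():
              buckets[n].append(route)

          ordered = []
          for n in range(max_count, -1, -1):   # run from the max down to zero
              for route in buckets[n]:
                  ordered.append((route, n))
          return ordered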

    > You linked? Where? I'd like to know about any library that will do this. Tell me of any library that does integration tests that spins up infrastructure for you. The only one closest I can think of that you can run locally is anything that would use docker-compose or some other IAC language that controls containers. I honestly don't think any popular ones exist.

    https://testcontainers-python.readthedocs.io/en/latest/

    I am very sure that I linked that, but in case I didn't, here it is again -- hope you find it useful.
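
    For what it's worth, here's a minimal sketch of what that looks like (the test body and key names are made up): the library starts a throwaway Redis container for the test and tears it down afterwards, with no docker-compose scripts involved.

      import redis
      from testcontainers.redis import RedisContainer

      def test_hit_counts_roundtrip():
          # spins up a disposable Redis in Docker for the duration of the test
          with RedisContainer() as container:
              client = redis.Redis(
                  host=container.get_container_host_ip(),
                  port=int(container.get_exposed_port(6379)),
              )
              client.zincrby("route_hits", 1, "/users/")
              assert client.zscore("route_hits", "/users/") == 1.0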

    > No way I'm going to assume the user has redis installed on his local machine. Many devs don't. It's all remote development for them or everything lives in containers.
