I must admit, I've never been a big fan of coding simple algorithms as part of the interview process. I think the questions are too shallow, and the tooling used to validate the answers too simple. Today, when a friend sent me a copy of a question they were asked as part of an interview process. Ignoring the fact this person was interviewing for a Chief Technical Officer role, the question is incredibly bad. I want to break down why I think it's bad, and how I think you'd get a much better outcome from the process.

The question is such that you must provide code to solve the problem, in any number of languages including C, C++, Python, Javascript, Typescript, Ruby, Rust, Go etc. The tooling will scan your code, test it and provide you a percentage based score.

This is the question:

Write a function: function solution(MyList: number[]) : number;

that, given an array of MyList of N integers, returns the smallest positive integer (greater than 0), that does not occur in MyList.

For example, given MyList = [1,3,6,4,1,2], the function should return 5. Given MyList = [1,2,3], the function should return 4. Given MyList = [-1,-3], the function should return 1.

Write an efficient algorithm for the following assumptions:

N is an integer within the range [1.. 100,000].

each element of array MyList is an integer within the range

Note: I don't know why the assumption is that all elements are positive, yet the third example has negative values

Before I venture in to the discussion of why this sucks let me clarify a few things:

I am not the best programmer. I write ok code, but I do try to write clean - My Github
I primarily write code in C++, PHP/Laravel and Python.
I am reasonably proficient in Javascript, and can read Typescript.

The junior response

Looking at the question, we can take the mindset of a junior programmer and code up an answer. In my mind, it would look something like this:

def solution(MyList):
  MyList = [x for x in MyList if x > 0] # Remove 0 and negative numbers
  MyList.sort() # Sort array
  smallest = 1
  for num in MyList:
    if num == smallest:
      smallest += 1
    elif num > smallest:
      break 
  return smallest

It's not bad right. It's a small piece of code that is easy to follow and gets the job done. If we have an interest in Big-O then these are the following results:

Sorting: O(N log N)
Loop: O(N)
Total (N log N)

The optimal solution

Let's ask ChatGPT for an optimal solution here.

def solution(MyList):
    num_set = set(x for x in MyList if x > 0)

    for i in range(1, len(MyList) + 2):  # We go up to N + 1
        if i not in num_set:
            return i

This solution is nice. The code is short, easy to follow and handles negative numbers well.

Looking at Big-O we get:

Create the set: O(N)
Check for missing element: O(N)
Total: O(N)

Is this really so bad?

Both solutions are ok and produce accurate outputs. But the question itself lacks a large amount of necessary context. This question does nothing to give us an understanding of the potential candidates expertise. It does however prove that they can use ChatGPT or vibe-code.

Let's instead ask ourselves. What is so bad about this question?

Efficient is not defined

The question asks us to write an efficient solution, but doesn't define whether efficient relates to speed or space. Do we want a fast solution, or one that doesn't require extra memory?

The optimal solution will be different depending on what we value more, speed or space.

Solving for Speed

If we assume that the purpose of the question is to write the fastest code possible, then providing an example function signature in Typescript feels out of place. Typescript is an interpreted languages that is orders of magnitude slower than compiled languages (e.g., C, C++, Rust). Typescript also gives us an indication of potential runtime environment. We'd be considering either client-side in the browser, or server-side in a NodeJS type configuration.

Client-side performance is often less of a concern than server-side because it's a naturally distributed workload. Having our code run in the browsers locally for 10,000 users is relatively straight forward. The considerations we'd need to make to run the same code server-side for 10,000 requests would be fundamentally different (e.g., caching results).

Solving for Space

What if we're working in an embedded ecosystem where space is the limiting factor? We'd need to consider how the sorting algorithms work and any extra memory that might be consumed during the process. The type of CPU and it's feature set (pre-fetching, caching, branch prediction) would then become a consideration as well. We'd likely want to consider the overall complexity of the solution and implement performance monitoring tooling to help us build an environment specific optimal solution (e.g., profiling for cache misses in real world data set executions).

What we can say though is that in this scenario, the optimal solution is no longer optimal because it needs to allocate memory. The optimal solution is not optimal because even the python .sort() method allocates memory (up to O(N)).

We could swap to a slower method that allocates no extra memory. This would be much slower O(N^2) but would have O(1) space allocation. No extra memory would be required.

Bounds are not defined

While the solution does give us the range for values in MyList, it doesn't give us the length bounds of MyList. Will MyList always be less than 10 values? Or can it be 10 million? Or potentially unlimited?

Does this matter? Yes. A few things we'd need to consider:

What sorting algorithm would be best based on the expected number of elements? Perhaps we want one that is slower on fewer elements, but faster with many.
Larger arrays require more memory, copying or using sorting algorithms with higher memory requirements are going to be less space performant.
Can we use threading? fibers? GPU/CUDA/OpenCL? It'd be beneficial if we had 500 billion elements in the array.
In languages like C and C++, we can choose to allocate memory on the stack or heap. Generally speaking, the stack is quicker but has space limitations, while the heap is slower and can support a much much larger allocation of memory. In the middle, objects like std::vector use heap allocation but guarantee contiguous memory layout like the stack.
Can the array have no elements? What do I return then?

What the const?

Probably the most interesting part of the problem space for me is the const-ness of the parameter MyList. What authority does the code within the solution() function have to modify the parameter MyList? Typescript, like most languages, will pass an array by reference instead of value. This means that any filtering or sorting we do on the array will permanently modify the array outside of the function. The scope of this example is to analyse the array, not modify it. I would think modification of the parameter MyList is bad and shouldn't be allowed.

If we pick Go as our language of choice, then the array MyList is passed in my value. This results in a copy operation which is non-performant and not caught in the Big-O analysis, but also means that any changes we make to the MyList array within the solution function are done so locally only. The calling function will not have any modified values. If we pick C, C++, Python or PHP then the MyList array is passed by reference and modifications made in the solution function are done globally. What about Rust I hear you ask? Rust passes it by reference but makes you explicitly mark the variable as mutable before you can change it.

The problem is that we do not know if we are ALLOWED to modify the contents of the parameter MyList or not. And this matters because data integrity matters.

Conclusion

Asking someone to write code like this is pointless. They will head to insert favourite LLM like ChatGPT and just ask it to provide the answer. The code will look good and pass the test, but the developer does not have a chance to discuss their solution. Without providing the necessary extra information and context, all solutions are going to be non-optimal. The candidate should not be penalised for choosing space over speed for example.

The process should reward the candidate for their understanding of potential concerns and risks. You can vibe-code a speed optimal solution, or a space optimal solution. But the LLM won't prompt you to consider the const-ness, or the bounds of the input parameter. Set your candidates up to succeed. Give them a project like DuckDuckGo or give them the code and ask them to provide a list of questions they would ask, concerns they'd have with the question being asked.

For what it's worth, I dropped the question in to ChatGPT and asked it to provide me an optimal solution. It picked speed over space and ignored const-ness. What good is a test if the candidate cannot demonstrate anything better than the default answer from ChatGPT?

This coding interview question sucks