Software Engineer | AI Platform (Ref 23046)

About us

At Programize, we partner with teams of all sizes - from startups to established enterprises - across industries and continents to create innovative, high-impact software products. We don’t just implement requirements; we turn ambitious ideas into marketable software solutions we are genuinely proud to put our names on. With 200+ successfully delivered projects behind us, we’ve tackled everything from greenfield architectures to complex, large-scale platforms.

Our vision is to become the go-to company for entrepreneurs and engineers who want to design and develop impactful, scalable software systems.

To achieve that, we need talented professionals to join our team and share our enthusiasm for technology and innovation.

The Role

We are looking for a Software Engineer to join our team and collaborate with an international organization building an advanced AI agent evaluation platform.

The platform is designed to test, evaluate, and benchmark AI agents at scale, helping organizations understand and improve the behavior, reliability, and performance of AI-powered systems. You will design and build production-grade services, APIs, and data-driven systems that power AI evaluation workflows. While you will work closely with LLMs and agent-based applications, the focus of the role is on engineering excellence, system reliability, and building scalable software solutions.

You will have the opportunity to own features end-to-end, collaborate with experienced engineers, and contribute to the next generation of AI-powered products.


What You Will Do

  • Backend services & APIs. Build and maintain the services, data models, and APIs that power the platform - designed for correctness, testability, and scale.

  • Simulation & orchestration. Work on the systems that coordinate complex, multi-step interactions between AI agents and external systems, improving their reliability and throughput.

  • Evaluation & scoring. Design systems that grade agent outputs, combining deterministic checks with model-assisted judgment - and make scoring reliable, explainable, and reproducible.

  • Data pipelines. Build pipelines that generate, transform, and quality-check large volumes of structured data and benchmark content.

  • Quality & reliability. Add the tests, instrumentation, and safeguards needed to trust outputs from systems that are inherently non-deterministic.


What You Have

  • 4+ years building and shipping production software, with strong proficiency in Python.

  • Deep software engineering fundamentals: system and API design, data modeling, concurrency/async, testing strategy, debugging, and code review. You can own a non-trivial service end-to-end.

  • Experience designing and operating distributed or service-oriented systems (queues, workers, APIs), not just calling them.

  • Comfort designing schemas and working with relational databases, plus the migrations and performance concerns that come with them.

  • Working knowledge of LLM APIs, orchestration, structured outputs, and handling non-determinism. We expect you to use LLMs effectively, but this is not a prompt-engineering role.

  • Ability to reason about correctness of probabilistic systems: how to test, measure, and trust outputs that aren't byte-for-byte deterministic.

  • High quality bar: you write tests, types, and docs by default, and you keep changes small and reviewable.
     

Nice to have

  • Experience building agentic or multi-agent systems, tool-use, or orchestration frameworks.

  • Background in evaluation / benchmarking of ML or LLM systems (rubrics, golden datasets, model-as-judge, inter-rater reliability).

  • Experience with distributed task queues and async workloads.

  • Modern Python tooling and typed codebases (e.g. type checkers, linters, Pydantic, FastAPI).

  • Retrieval / search experience and working with data ingest pipelines.

  • Some comfort with the infra side (Docker, CI/CD) so you can ship what you build.
     

What to expect from us

Programize was founded on the values of respect and appreciation for customers and colleagues alike. We believe in equal opportunity, diversity, flexibility, hard work, and continuous improvement in all aspects of our company. We want our people to feel happy, creative, productive, and motivated. At Programize, you will find the following:

  • Friendly, respectful and appreciative working environment.

  • Competitive remuneration package.

  • On-site and remote working options.

  • Lab-like, collaborative, and engaging environment.

  • Continuous learning and growth opportunities.

  • International working environment.

  • Work-life balance.

  • Private health insurance plan, including dependents.
     

 

Disclaimer:
Programize collects and processes personal data in accordance with the EU General Data Protection Regulation (GDPR). We are bound to use the information provided in your job application for recruitment purposes only and not to share it with any unauthorized third parties. All applications will be treated as strictly confidential.
