Job Overview: Short-Term Contract (4 weeks) for a Backend Engineer focusing on multi-agent benchmark tasks.
Duties and Responsibilities: Build multi-agent benchmark tasks based on real-world open-source code changes such as bug fixes, migrations, and refactors. Work with the Harbor evaluation framework to run and validate tasks inside Docker environments. Write clear and precise task instructions specifying file paths, function signatures, expected behavior, and constraints. Design and implement Python-based verification scripts to validate correctness of agent-generated code changes. Create decomposition strategies that split complex code changes across multiple independent sub-agents. Run, debug, and refine tasks within containerized environments to ensure reproducibility and determinism. Evaluate task performance signals and improve task quality, clarity, and difficulty. Contribute to benchmark development for advanced AI coding agents.
Required Qualifications: Strong experience in Python and JavaScript development. Experience with AI coding benchmarks (e.g., SWE-bench, Terminal-Bench). Strong experience reading and navigating large open-source codebases (e.g., Django, Flask, FastAPI, Node.js, or similar). Familiarity with Git workflows, including pull requests, diffs, cherry-picking, and working with specific commits. Comfort with Docker, including writing Dockerfiles, building images, and debugging container issues. Experience writing test scripts using pytest, unittest, or custom assertion-based testing. Ability to write clear, precise, and unambiguous technical specifications. Ability to work independently in a remote environment.
Additional Notes: Compensation: $15 per hour. Commitment: 20-40 hours per week, with 4 hours of overlap with PST. Application Process: Apply via Apply/Easy Apply, then check your email for the application form and fill out the Google form. Shortlisted candidates receive an assessment link, which must be completed within 24 hours.