Mathematics Model Prompt Evaluator

Work from home Full-time role Hiring

Role Overview

We are seeking expert mathematicians to author and verify high-quality open-ended prompts for AI model evaluation. You will craft and review challenging, unambiguous mathematical problems across core subdomains, assessing AI reasoning quality and helping establish rigorous evaluation standards for frontier language models.

**You will be assigned one of two task types:**

**Authoring Task**

Create 5 original, open-ended prompts from your assigned subdomain at varying difficulty levels (undergraduate, advanced undergraduate, or graduate/professional). Prompts should require human judgment to evaluate the quality of the AI's response, such as chain-of-thought reasoning or proof construction.

**Verification Task**

Review 5 authored prompts for clarity, scope alignment, difficulty accuracy, and uniqueness. Edit prompts and difficulty ratings where needed.

**Mathematics Subdomains Covered**

Probability & Statistics, Algebra (incl. Linear Algebra), Ordinary/Partial Differential Equations & Dynamical Systems, Geometry, Graph Theory, Number Theory.

**Key Responsibilities**

- Author clear, unambiguous, open-ended mathematical prompts that elicit evaluable AI responses
- Verify prompts are within the scope of the assigned subdomain and correctly rated for difficulty
- Ensure all 5 prompts in a task are sufficiently distinct from one another with varying difficulty levels
- Apply expert judgment to assess the depth and quality of mathematical reasoning required
- Edit prompts and difficulty assignments where standards are not met

**Ideal Qualifications**

- Master's degree or higher in Mathematics, Applied Mathematics, Statistics, or a closely related field
- 2–6 years of professional or research experience in a quantitative field
- Strong command of graduate-level mathematical concepts including proof writing, analysis, and formal reasoning
- Experience in academic research, mathematical competition design, or quantitative industry roles is a plus
- Excellent written English and ability to craft precise, well-scoped technical questions

**More About the Opportunity**

- Expected commitment: 10+ hours/week
- Asynchronous, fully remote work
