Senior Software Engineer – LLM Evaluation
Evaluate AI-generated code and improve performance, reliability, and scalability of LLM outputs.
Analyze datasets and develop insights using Python to improve AI model accuracy and performance.
Evaluate machine learning systems through benchmarking and improve model performance using real-world datasets.
Write clear, structured technical documentation and analysis that explain AI datasets and model behavior.
Develop scalable Python applications for AI model training, evaluation, and deployment.
Annotate images and videos to improve computer vision models and enhance AI understanding of visual data.
Evaluate Python codebases and GitHub issues while improving software quality for AI training and validation workflows.
Record high-quality Spanish voiceovers for AI training datasets, focusing on clarity, tone, and linguistic accuracy.
Create engaging video content showcasing real-world environments to enhance AI understanding of visual and contextual scenarios.
Evaluate personalization quality in AI systems using cultural and contextual understanding of Japanese user behavior.
Build enterprise-grade generative AI systems using knowledge graphs, LLMs, and scalable architectures for production environments.
Design and craft multi-turn conversational datasets to improve AI agent reasoning, function calling accuracy, and real-world interaction capabilities.