The position
For a leading global semiconductor and AI technology organization, we are looking for an AI Engineer passionate about Generative AI and Agentic AI systems, with a strong focus on optimizing models for on-device deployment.
In this role, you will work with Large Language Models (LLMs), Large Multimodal Models (LMMs), and Vision-Language-Action (VLA) models, ensuring they run efficiently on NPU-accelerated edge platforms. The role focuses on model compression, system optimizations, and agentic capabilities such as function calling and tool orchestration.
Experience designing secure and reliable agentic workflows (e.g., guardrails, safe tool invocation) is considered a strong advantage.
Responsibilities
Model Optimization
- Optimize LLMs and multimodal models for on-device deployment
- Apply techniques such as quantization (8-bit, 4-bit, mixed precision), pruning, and distillation for NPU-based edge targets
Inference Performance
- Develop system optimizations such as speculative decoding and other efficient decoding algorithms for edge environments
Agentic AI
- Investigate methods to improve small language models so that tiny agents can operate directly on edge devices
- Implement solutions aligned with AI safety principles
Deployment
Deploy optimized models using tools such as:
- Ollama
- llama.cpp
- ONNX Runtime
- TensorFlow Lite (TFLite)
Benchmarking
- Design pipelines to benchmark Generative and Agentic AI systems running on-device
Prototyping
Build proof-of-concepts and demonstrators for applications such as:
- Industrial safety monitoring
- In-cabin sensing
- Other edge AI use cases
Productization
- Translate research and optimization techniques into production-ready solutions
- Collaborate with product teams to integrate solutions into software and hardware platforms
About the company
NXP Semiconductors is a global semiconductor company delivering solutions for automotive, industrial & IoT, mobile, and communications infrastructure. With operations in 30+ countries, NXP focuses on technologies that make the connected world better, safer, and more secure. The company reported $12.61B in revenue in 2024.
What you bring
Education
- MSc, PhD, or EngD in Computer Science or a related technical field
Experience
- 5+ years of experience in software or AI engineering
- Strong exposure to LLMs, multimodal models, and system performance optimization
AI & Model Optimization
Experience with:
- LLM quantization techniques such as SmoothQuant, SpinQuant, QuaRot
- Pruning approaches such as Wanda, SparseGPT
- Speculative decoding and system-level optimization techniques
AI Frameworks
Experience with:
Agentic AI
Familiarity with frameworks such as:
- LangChain
- Google ADK
- SmolAgents
Knowledge of agent safety and security mechanisms (guardrails, policy enforcement, secure function calling) is considered a plus.
AI Toolchains / Inference
Experience with tools such as:
- CUDA
- TensorRT
- TFLite
- ONNX
- Ollama
Embedded Systems
- Experience working with embedded systems and NPU accelerators
- Familiarity with embedded software architecture, build systems, and version control
Systems & Tooling
Experience with:
- GNU/Linux
- Development boards and processors
- Software development environments
Additional experience with MLOps tooling (MLflow, ClearML) is beneficial.
Knowledge of Yocto, OpenEmbedded, and ARM cross-compilation toolchains is a plus.
Programming
Strong programming skills in:
Communication
- Excellent English communication skills
- Experience working in multi-site and multicultural teams is preferred