Microsoft introduces ASSERT, an AI testing tool for developers

AI researchers and labs are making great strides in evaluating models for safety, compliance, and robustness. However, companies and developers face new challenges in ensuring that custom-built AI systems perform as expected. To simplify this process, Microsoft has announced a new open-source framework called ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing), according to TechCrunch.
ASSERT evaluates AI model behavior based on natural language descriptions. The framework analyzes high-level goals, policies, or expected behaviors and translates them into systematic tests. This lets developers automatically verify complex scenarios specific to their applications and score the results.
This tool records the paths taken by the AI system, including intermediate actions and calls to external tools. This helps identify exactly where errors occur. For example, if an AI agent working with documents should not send emails to people outside the company or should only show confidential information to managers, ASSERT continuously checks compliance with these rules.
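To make the email rule concrete, the following is a minimal Python sketch of checking a recorded trace of tool calls against a policy. It is not ASSERT's API: the `ToolCall` class, the `send_email` tool name, the `to` argument, the `example.com` domain, and the sample trace are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical company domain, assumed for this illustration only.
COMPANY_DOMAIN = "example.com"


@dataclass
class ToolCall:
    """One recorded step in an agent's trace (hypothetical structure)."""
    tool: str
    args: dict


def external_email_violations(trace, domain=COMPANY_DOMAIN):
    """Return (step index, recipient) for each email sent outside the domain."""
    violations = []
    for index, call in enumerate(trace):
        if call.tool != "send_email":
            continue  # only email steps are relevant to this rule
        for recipient in call.args.get("to", []):
            if not recipient.lower().endswith("@" + domain):
                violations.append((index, recipient))
    return violations


# Invented sample trace: the third step breaks the rule.
trace = [
    ToolCall("read_document", {"path": "q3_report.docx"}),
    ToolCall("send_email", {"to": ["ana@example.com"]}),
    ToolCall("send_email", {"to": ["bob@example.com", "eve@other.org"]}),
]

print(external_email_violations(trace))  # [(2, 'eve@other.org')]
```

Because the check runs over every recorded step rather than only the final answer, the result points to the exact step where the rule was broken.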
Microsoft representative Sarah Bird emphasizes that the evaluation process is crucial for making informed decisions. If a system's behavior is not fully understood, it is difficult to know if it meets organizational requirements. ASSERT is a useful tool not only during the development process but also for continuous monitoring after deployment.
This news comes against the backdrop of broader changes in the AI industry. As models become more powerful, researchers are increasingly focusing on iterative testing of models under various conditions through projects like Stanford's HELM or MLCommons' AILuminate.



















