GoCoMA Framework Enhances Attribution of LLM-Generated Code
GoCoMA is a multimodal framework for large language model (LLM) code attribution, the task of identifying which model generated a given piece of code. It models an extrinsic hierarchy that spans higher-level code stylometry and lower-level binary images, and it combines the two modalities with geodesic-cosine similarity-based fusion. GoCoMA outperforms existing baselines on the CoDET-M4 and LLMAuthorBench benchmarks.
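The summary does not spell out how geodesic-cosine similarity-based fusion is computed, so the following is only a minimal sketch under stated assumptions, not GoCoMA's actual method. It assumes the geodesic term is the Poincaré-ball distance between a stylometry embedding and a binary-image embedding, and that this is mixed equally with a rescaled cosine similarity. The function names (`agreement_score`, `fuse`), the 0.5/0.5 mixing, and the concatenation step are all hypothetical choices made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball (curvature -1).

    Both points must lie strictly inside the unit ball (norm < 1).
    Assumption: the source does not say which manifold GoCoMA uses.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))

def agreement_score(u, v):
    """Hypothetical score in roughly [0, 1]: high when the modalities agree."""
    cos_term = (cosine_similarity(u, v) + 1.0) / 2.0   # map [-1, 1] to [0, 1]
    geo_term = math.exp(-poincare_distance(u, v))      # map [0, inf) to (0, 1]
    return 0.5 * cos_term + 0.5 * geo_term             # equal mixing is an assumption

def fuse(style_vec, image_vec):
    """Hypothetical fusion: concatenate both embeddings and append the score."""
    return list(style_vec) + list(image_vec) + [agreement_score(style_vec, image_vec)]

# Toy embeddings standing in for the stylometry and binary-image modalities.
style = [0.10, 0.20]
image = [0.30, 0.10]
fused = fuse(style, image)
```

In this sketch the fused vector would be passed to a downstream classifier that predicts the generating model; a learned gate or attention layer could replace the fixed equal mixing.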
GoCoMA matters strategically because it reflects the growing need for transparency about AI-generated content. Better attribution of code to its generative source can strengthen security protocols and help resolve licensing ambiguities. This makes the framework relevant to regulators and developers who must navigate the complex landscape of AI-generated software and ensure ethical, compliant use across applications.