The study aimed to validate the quality of assessment items generated by large language models (LLMs) for use in the TIMSS fourth-grade mathematics and science assessments. This publication comprises the report, a psychometric analysis supplement, and five annex documents. The research was supported by the IEA Research and Development call three.