Gomes, Fernando, Bhansali, Shekhar, da Silveira Maranhão, Fabiola, Valladão, Viviane Silva, & Velasco, Karine. (2025). “What are the future directions for microplastics characterization? A regex-llama data mining approach for identifying emerging trends.” Anais da Academia Brasileira de Ciências, 97, e20241345. https://doi.org/10.1590/0001-3765202520241345
This study presents a new hybrid method for identifying and analyzing the techniques used to characterize microplastics. Combining regular-expression (regex) pattern matching with the Llama 3.2:3b language model lets us detect and interpret mentions of both traditional and emerging techniques in the literature. Established methods such as Raman and FTIR spectroscopy are examined alongside advanced tools such as X-ray Photoelectron Spectroscopy (XPS) and Surface-Enhanced Raman Spectroscopy (SERS). The approach improves both the speed and the accuracy with which complex terms used in microplastics research are identified. Using VOSDataAnalyzer and VOSviewer, we mapped the connections and trends among related terms and identified the 15 most commonly used and emerging techniques. Our analysis shows a shift toward more sensitive and innovative methods in microplastics studies. The Regex-Llama approach, introduced here for the first time, can be applied broadly to tasks such as studying environmental pollutants, evaluating material degradation in engineering, and assessing the health impacts of tiny contaminants. Overall, this strategy supports environmental assessments and helps guide pollution reduction efforts across multiple fields.
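
The two-stage idea described above (a regex pass for known technique names, followed by a language-model pass for terms the patterns miss) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the pattern table, the function names, and the `llm_pass` hook are all hypothetical, and the language-model step is left as a pluggable callable because the prompts and model invocation are not given here.

```python
import re
from collections import Counter
from typing import Callable, Iterable

# Illustrative patterns only (an assumption, not the paper's pattern set).
# Order matters: longer names (SERS) are consumed before the shorter
# names they contain (Raman), so one mention is not counted twice.
PATTERNS = [
    ("SERS", r"\bsurface[-\s]enhanced\s+raman(?:\s+(?:spectroscopy|scattering))?|\bSERS\b"),
    ("XPS", r"\bx[-\s]ray\s+photoelectron\s+spectroscopy|\bXPS\b"),
    ("FTIR", r"\bfourier[-\s]transform\s+infrared(?:\s+spectroscopy)?|\bFT-?IR\b"),
    ("Raman", r"\braman\b"),
]
COMPILED = [(name, re.compile(rx, re.IGNORECASE)) for name, rx in PATTERNS]


def regex_pass(text: str) -> set[str]:
    """Return canonical names of the techniques matched by the patterns."""
    found = set()
    remaining = text
    for name, pattern in COMPILED:
        # Blank out each match so later, shorter patterns cannot re-match it.
        remaining, hits = pattern.subn(" ", remaining)
        if hits:
            found.add(name)
    return found


def no_llm(text: str) -> Iterable[str]:
    """Placeholder for the language-model stage; finds nothing."""
    return []


def extract_techniques(
    text: str, llm_pass: Callable[[str], Iterable[str]] = no_llm
) -> set[str]:
    """Union of regex hits and whatever the (hypothetical) LLM hook returns."""
    return regex_pass(text) | set(llm_pass(text))


def count_techniques(
    texts: Iterable[str], llm_pass: Callable[[str], Iterable[str]] = no_llm
) -> Counter:
    """Count, per technique, the number of texts that mention it."""
    counts = Counter()
    for text in texts:
        counts.update(extract_techniques(text, llm_pass))
    return counts
```

In this sketch, a caller would supply `llm_pass` as a function that sends the text to a locally served model and parses technique names out of the reply; the default stub keeps the example runnable without any model.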

Figure 1. Chemical structures of the polymers most commonly found in microplastic pollution, in sequence: Polyethylene (PE), Polypropylene (PP), Polystyrene (PS), and Polyethylene Terephthalate (PET).