Abstract: Finding exact clones in source code can be handled efficiently using classical exact substring or subtree pattern matching techniques inspired by genomics applications. These methods can serve as a foundation for new techniques that highlight duplicated code chunks presenting minor edits or more extensive modifications at a higher structural scale. The main goal is to improve the recall of small near matches and to aggregate them into larger ones, providing a more global view of similarities at a reasonable complexity. These concerns are essential for addressing a large database of source code projects.
Michel Chilowicz, Étienne Duris, Gilles Roussel. Towards a multi-scale approach for source code approximate match report. 4th International Workshop on Software Clones (IWSC'10), May 2010, Cape Town, South Africa. pp.89-90. ⟨hal-00620368⟩
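As an illustration of the idea summarized in the abstract, the following minimal sketch finds exact k-token matches between two token sequences and then greedily merges neighbouring matches separated by a small gap into larger approximate matches. It is not the authors' algorithm: the function names `exact_kgram_matches` and `aggregate`, the parameters `k` and `max_gap`, and the greedy merging rule are all assumptions made for this example, and a realistic tool would rely on suffix-based indexes or syntax trees rather than a hash table of k-grams.

```python
from collections import defaultdict


def exact_kgram_matches(a, b, k):
    """Return all (i, j) such that a[i:i+k] == b[j:j+k] (exact seeds)."""
    index = defaultdict(list)
    for j in range(len(b) - k + 1):
        index[tuple(b[j:j + k])].append(j)
    seeds = []
    for i in range(len(a) - k + 1):
        for j in index.get(tuple(a[i:i + k]), ()):
            seeds.append((i, j))
    return seeds


def aggregate(seeds, k, max_gap):
    """Greedily merge seeds into regions (start_a, end_a, start_b, end_b).

    End positions are exclusive. A seed joins the first region it starts
    inside of, or at most max_gap tokens after, in both sequences.
    This is a hypothetical, simplified merging rule.
    """
    regions = []
    for i, j in sorted(seeds):
        for idx, (sa, ea, sb, eb) in enumerate(regions):
            if sa <= i <= ea + max_gap and sb <= j <= eb + max_gap:
                regions[idx] = (sa, max(ea, i + k), sb, max(eb, j + k))
                break
        else:
            # No existing region is close enough: start a new one.
            regions.append((i, i + k, j, j + k))
    return regions


# Two token sequences that differ only by one renamed variable (y vs z).
a_tokens = "x = a + b ; y = x * 2 ; return y".split()
b_tokens = "x = a + b ; z = x * 2 ; return z".split()

seeds = exact_kgram_matches(a_tokens, b_tokens, 3)
exact_only = aggregate(seeds, 3, max_gap=0)   # [(0, 6, 0, 6), (7, 13, 7, 13)]
with_gap = aggregate(seeds, 3, max_gap=1)     # [(0, 13, 0, 13)]
```

With `max_gap=0` the renamed variable splits the duplicate into two exact regions; allowing a gap of one token reports a single larger approximate match, which is the kind of coarser view the abstract aims for.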