Automatic Audio Segmentation : Segment Boundary and Structure Detection in Popular Music Autor :...
Automatic Audio Segmentation : Segment Boundary and Structure Detection in
Popular Music
Author: Ewald Peizer
Vienna University of Technology Institute of Software Technology and Interactive Systems
Introduction
Automatic audio segmentation aims to extract information about the structure of a song.
A song's structure is made up of sections such as intro, verse, chorus, bridge, etc.
The topic of this paper is a subfield of MIR (Music Information Retrieval) that aims to extract information about the musical structure of songs.
This information can be used in several practical applications:
1 - Easier browsing and searching in large music collections.
2 - Query-by-humming
Algorithm
This paper presents a two-phase algorithm:
1 - Segment Boundaries
2 - Structure Detection
Segment Boundaries
Features are extracted from the audio signal: spectrogram, Mel-frequency cepstral coefficients (MFCCs), Rhythm Patterns, Statistical Spectrum Descriptors, and the constant-Q transform.
J. Foote. Automatic audio segmentation using a measure of audio novelty
Boundary detection uses the algorithm proposed by Foote [Foo00]: a Gaussian novelty score is computed, and its peaks emerge as candidate boundaries.
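A rough sketch of a Foote-style novelty score may help make this step concrete. It is not the implementation used in the paper: the cosine similarity measure, the kernel half-width, the Gaussian width (`sigma_scale`), and the simple thresholded peak picking are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def gaussian_checkerboard(half_width, sigma_scale=0.5):
    """Checkerboard kernel (+1 same side, -1 opposite sides) with a Gaussian taper."""
    # Offsets relative to the candidate boundary; there is no zero offset.
    offsets = [k + 0.5 for k in range(-half_width, half_width)]
    sigma = sigma_scale * half_width  # assumed width of the taper
    kernel = []
    for a in offsets:
        row = []
        for b in offsets:
            sign = 1.0 if (a > 0) == (b > 0) else -1.0
            taper = math.exp(-(a * a + b * b) / (2.0 * sigma * sigma))
            row.append(sign * taper)
        kernel.append(row)
    return kernel


def novelty_curve(features, half_width=4):
    """Correlate the kernel along the main diagonal of the self-similarity matrix."""
    n = len(features)
    sim = [[cosine_similarity(features[i], features[j]) for j in range(n)] for i in range(n)]
    kernel = gaussian_checkerboard(half_width)
    scores = [0.0] * n  # positions too close to the edges keep a score of 0
    for t in range(half_width, n - half_width + 1):
        total = 0.0
        for a in range(2 * half_width):
            for b in range(2 * half_width):
                total += kernel[a][b] * sim[t - half_width + a][t - half_width + b]
        scores[t] = total
    return scores


def pick_peaks(scores, threshold):
    """Local maxima of the novelty curve above a threshold (assumed selection rule)."""
    return [
        t for t in range(1, len(scores) - 1)
        if scores[t] > threshold and scores[t] > scores[t - 1] and scores[t] >= scores[t + 1]
    ]


# Toy input: ten frames of one "timbre" followed by ten frames of another.
toy_features = [[1.0, 0.0]] * 10 + [[0.0, 1.0]] * 10
candidates = pick_peaks(novelty_curve(toy_features), threshold=1.0)
```

On the toy input, the curve stays near zero inside each homogeneous block and rises where the two blocks meet, so the frame index of the change is returned as the candidate boundary. Real feature sequences are noisier, and many peaks typically have to be filtered.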
Structure Detection
The output of the first phase of the algorithm serves as input to the second phase (structure detection).
The second phase tries to detect the structure of the song.
Each segment type gets its own label, for example:
A - intro, B - chorus, C - verse, D - bridge, etc.
The structure of a song can be deduced from these labels.
We assume that segments of the same type are represented by similar features.
Unsupervised clustering techniques are therefore applied.
Means-of-frames
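The sketch below shows one way such a clustering step could look, taking "means-of-frames" to mean that each segment is summarised by the average of its frame-level feature vectors. The slides do not say which clustering algorithm is used, so plain k-means with a deterministic initialisation stands in here as an assumption. The number of clusters `k`, the helper names, and the lettering by order of first appearance are likewise hypothetical.

```python
def mean_of_frames(frames):
    """Average a list of equal-length feature vectors into one vector."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans_labels(points, k, iterations=20):
    """Plain k-means with farthest-point initialisation (assumed, for determinism)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(squared_distance(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda j: squared_distance(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_of_frames(members)
    return labels


def label_segments(features, boundaries, k):
    """Cut the frame sequence at the boundaries and give each segment a letter."""
    edges = [0] + sorted(boundaries) + [len(features)]
    segments = [features[a:b] for a, b in zip(edges, edges[1:]) if b > a]
    labels = kmeans_labels([mean_of_frames(s) for s in segments], k)
    # Rename clusters in order of first appearance: A, B, C, ...
    names = {}
    for lab in labels:
        names.setdefault(lab, chr(ord("A") + len(names)))
    return "".join(names[lab] for lab in labels)


# Toy input: three segments of three frames each; the first and last are alike.
toy_features = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3 + [[1.0, 0.0]] * 3
structure = label_segments(toy_features, boundaries=[3, 6], k=2)
```

In this sketch the letters only say which segments belong together. Mapping a letter to a musical role such as chorus or verse would need further information that clustering alone does not provide.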
Evaluation
To make the results comparable with other research, several corpora were merged into one.
Total = 108 songs, the largest corpus used so far for an AAS (automatic audio segmentation) algorithm.
Evaluation - Ambiguity
The structure of a song is ambiguous, so evaluating the algorithm's results is not trivial.
The evaluation was carried out using a two-level hierarchical model.
Results
Audio Segmentation File Format
A new XML-based format, SegmXML, was introduced.
A file in this format holds hierarchical segmentation information, providing a common basis so that future results can be compared more easily.
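To illustrate what hierarchical segmentation information in XML could look like, the sketch below builds a small two-level tree with Python's standard `xml.etree.ElementTree` module. The slides do not show the SegmXML schema, so every element name, attribute name, and value here (`segmentation`, `segment`, `label`, `start`, `end`, the song and its timings) is invented for illustration and should not be read as the actual format.

```python
import xml.etree.ElementTree as ET


def build_segmentation(song_title, sections):
    """Build a two-level tree: top-level segments that may contain sub-segments.

    Element and attribute names are hypothetical, not the SegmXML schema.
    """
    root = ET.Element("segmentation", {"song": song_title})
    for label, start, end, children in sections:
        node = ET.SubElement(
            root, "segment", {"label": label, "start": str(start), "end": str(end)}
        )
        for child_label, child_start, child_end in children:
            ET.SubElement(
                node,
                "segment",
                {"label": child_label, "start": str(child_start), "end": str(child_end)},
            )
    return root


# Made-up example data: times are in seconds.
example = build_segmentation(
    "hypothetical song",
    [
        ("intro", 0.0, 12.5, []),
        ("verse", 12.5, 45.0, [("verse-a", 12.5, 28.0), ("verse-b", 28.0, 45.0)]),
    ],
)
xml_text = ET.tostring(example, encoding="unicode")
```

Nesting sub-segments inside their parent segment is one natural way to keep both levels of a two-level annotation in a single file, so that an evaluation can compare against either level.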
Limitations
The novelty score often produces a peak that merely signals a change of instrument.
Songs with a dense, distorted guitar sound appear to give worse results than melodic songs.
Conclusions
The algorithm proved to be robust in both a negative and a positive sense:
Many experiments with different parameter settings and newly applied heuristics did not lead to any statistical improvement in performance.
On the other hand, cross-validation and evaluation on an independent test set showed no drop in performance.
References
[Foo00] J. Foote. Automatic audio segmentation using a measure of audio novelty. In Proc. ICME, volume 1, New York City, New York, USA, 2000.
[ANS+05] S. Abdallah, K. Noland, M. Sandler, M. Casey, and C. Rhodes. Theory and evaluation of a Bayesian music structure extractor. In Proc. ISMIR 2005, pages 420-425, London, UK, 2005.
[Cha05] W. Chai. Automated analysis of musical structure. PhD thesis, Massachusetts Institute of Technology, MA, USA, September 2005.
[HSG06] C. Harte, M. Sandler, and M. Gasser. Detecting harmonic change in musical audio. In Proc. ACMMM, pages 21-26, Santa Barbara, California, USA, 2006. ACM Press New York, New York, USA.
[PK06] J. Paulus and A. Klapuri. Music structure analysis by finding repeated parts. In Proc. ACMMM, pages 59-68, Santa Barbara, California, USA, 2006. ACM Press New York, New York, USA.