learning the ropes

things I made at ITP and after: sketches, prototypes, and other documentation

Friday, March 30, 2007

Musical Speech

Another version of the musical speech patch. I spoke with Peter last week about fffb~ and made it work — to some degree.

I ran into trouble finding the frequency bin with the maximum energy. Jonathan Marcus helped me with a solution to that problem using the zl object.
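For readers without Max, here is a minimal sketch of the same idea in Python: find the band with the most energy, then look up the frequency stored at the same position in a list. This is not the patch itself. The function name and the sample energy values are made up for illustration, and the frequency list is just the first twelve values from the patch's message box.

```python
def pick_frequency(energies, frequencies):
    """Return the frequency paired with the band holding the most energy."""
    if not energies:
        raise ValueError("energies must not be empty")
    if len(frequencies) < len(energies):
        raise ValueError("need at least one frequency per band")
    # Index of the largest energy value; ties resolve to the first one found.
    peak_index = max(range(len(energies)), key=energies.__getitem__)
    return frequencies[peak_index]


# First twelve entries of the frequency list used in the patch.
FREQUENCIES = [60.0, 120.0, 180.0, 240.0, 440.0, 510.0,
               530.0, 600.0, 660.0, 700.0, 725.0, 800.0]

# Hypothetical band energies, one per filter band.
energies = [0.1, 0.3, 0.2, 0.9, 0.4, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0]

print(pick_frequency(energies, FREQUENCIES))
```

In this example the fourth band is the strongest, so the fourth frequency in the list is the one that would be played.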

The patch makes some sounds now, but still not what I hoped for. I wanted a patch that would highlight the musicality of recorded speech. I expect it would work better on some voices than others, but so far it just sounds “random.”

  • When the .wav file is silent, the patch plays the last frequency in the frequency transform list. To fix this, I will need to detect that case and turn off the cycle~ object.
  • I tried testing the patch with the cycle~ object as an input, thinking that a pure tone would be a good way to test the patch’s ability to recognize frequencies. As I changed the frequencies of the cycle~ object going into the patch, there were some spots where the pitch detection “blew up” and returned to the highest frequency in my list.
  • More experimentation needs to be done with voice recordings to see how they respond. I will post some results soon.
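One possible way to handle the silent case from the first item above, sketched in Python rather than Max: treat the input as silent when even the strongest band falls below a threshold, and mute the oscillator instead of picking a frequency. The function name and the threshold value are assumptions for illustration, not something taken from the patch.

```python
def gated_pitch(energies, frequencies, silence_threshold=0.01):
    """Return (frequency, gain); gain is 0.0 when the input looks silent.

    silence_threshold is a hypothetical value that would need tuning by ear.
    """
    peak = max(energies)
    if peak < silence_threshold:
        # No band has meaningful energy: keep the oscillator muted.
        return None, 0.0
    # Otherwise play the frequency that lines up with the strongest band.
    return frequencies[energies.index(peak)], 1.0


print(gated_pitch([0.0, 0.0, 0.0], [60.0, 120.0, 180.0]))
print(gated_pitch([0.0, 0.5, 0.2], [60.0, 120.0, 180.0]))
```

In the patch, the equivalent would probably be a comparison on the output of the maximum object that sets the oscillator's volume multiplier to zero, though I have not tried that yet.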

[Image: fft spectrum]

Using the patch
- Enable the DAC
- Click the “open” message to load a file into the sfplay~ object
- Click the “1” attached to the sfplay~ object to start playback
- The following controls modify the output: transform strength, wave volume, and transform volume
- “Transform Strength” controls the scaling of the values coming out of the fffb~. Increasing the transform strength will cause larger numbers to be packed into the frequency bin energy list. This primarily affects the height of the peaks on the multislider display.
- “Wave Volume” and “Transform Volume” adjust the relative sound levels of the original .wav file and the transformed signal. Set transform volume higher than wave volume to emphasize the transformed sound.
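The controls above can be sketched roughly in Python. Scaling every band by the same positive factor changes the height of the peaks but not which band is the strongest, and the two volume controls act as a plain weighted sum of the original and transformed signals. The function names are hypothetical. The defaults (128, 0.5, 0.8) are the starting values typed into the patch's multiply objects.

```python
def scale_energies(energies, strength=128.0):
    """Multiply every band energy by the transform strength."""
    return [e * strength for e in energies]


def mix(wave, tone, wave_volume=0.5, transform_volume=0.8):
    """Weighted sum of the original samples and the transformed tone."""
    return [wave_volume * w + transform_volume * t
            for w, t in zip(wave, tone)]


energies = [0.1, 0.5, 0.25]
scaled = scale_energies(energies)

# The strongest band stays the same before and after scaling.
print(energies.index(max(energies)) == scaled.index(max(scaled)))

# Two made-up samples of each signal, mixed with the default volumes.
print(mix([1.0, 0.0], [0.0, 1.0]))
```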


  • To try out this patch, simply copy the following lines and paste them into MAX/MSP

    #P window setfont "Sans Serif" 9.;
    #P window linecount 9;
    #P comment 399 18 143 9109513 fffb~ is used here to break the audio from the .wav file into separate frequency bands. The band with the most energy will choose the frequency to be played. In this way \, I hope to create a system which highlights the melodic aspects of human speech.;
    #P window linecount 1;
    #P newex 70 478 33 9109513 *~ 0.5;
    #P flonum 110 430 35 9 0 0 0 139 0 0 0 221 221 221 222 222 222 0 0 0;
    #P flonum 234 551 35 9 0 0 0 139 0 0 0 221 221 221 222 222 222 0 0 0;
    #P flonum 246 105 35 9 0 0 0 139 0 0 0 221 221 221 222 222 222 0 0 0;
    #P flonum 459 301 35 9 0 0 0 139 0 0 0 221 221 221 222 222 222 0 0 0;
    #P flonum 529 300 35 9 0 0 0 139 0 0 0 221 221 221 222 222 222 0 0 0;
    #P flonum 580 301 35 9 0 0 0 139 0 0 0 221 221 221 222 222 222 0 0 0;
    #P message 248 370 395 9109513 60. 120. 180. 240. 440. 510 530 600 660 700 725 800 880 900 910 940 1000 1060 1120 1180;
    #P newex 185 489 35 9109513 zl reg;
    #P newex 185 456 72 9109513 t b i;
    #P newex 184 520 35 9109513 zl nth;
    #P newex 185 428 36 9109513 zl sub;
    #P newex 211 403 65 9109513 maximum 0.;
    #P newex 185 379 36 9109513 t l l;
    #P newex 185 613 33 9109513 *~ 0.8;
    #P newex 184 579 36 9109513 cycle~;
    #P flonum 184 555 35 9 0 0 0 139 0 0 0 221 221 221 222 222 222 0 0 0;
    #P message 106 77 14 9109513 0;
    #P message 87 77 14 9109513 1;
    #P message 47 77 28 9109513 open;
    #P newex 524 257 33 9109513 * 128.;
    #P newex 524 230 29 9109513 avg~;
    #P newex 484 257 33 9109513 * 128.;
    #P newex 484 230 29 9109513 avg~;
    #P newex 441 257 33 9109513 * 128.;
    #P newex 444 230 29 9109513 avg~;
    #P newex 404 257 33 9109513 * 128.;
    #P newex 404 230 29 9109513 avg~;
    #P newex 364 257 33 9109513 * 128.;
    #P newex 364 230 29 9109513 avg~;
    #P newex 324 257 33 9109513 * 128.;
    #P newex 324 230 29 9109513 avg~;
    #P newex 284 257 33 9109513 * 128.;
    #P newex 284 230 29 9109513 avg~;
    #P newex 244 257 33 9109513 * 128.;
    #P newex 244 230 29 9109513 avg~;
    #P toggle 176 77 15 0;
    #P newex 176 105 50 9109513 metro 100;
    #P toggle 13 77 15 0;
    #P newex 201 258 33 9109513 * 128.;
    #P newex 201 231 29 9109513 avg~;
    #P newex 161 258 33 9109513 * 128.;
    #P newex 161 231 29 9109513 avg~;
    #P newex 121 258 33 9109513 * 128.;
    #P newex 121 231 29 9109513 avg~;
    #P newex 185 326 160 9109513 pack 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.;
    #P newex 81 258 33 9109513 * 128.;
    #P newex 81 231 29 9109513 avg~;
    #P user multiSlider 300 440 200 96 0. 75. 12 2937 15 0 0 2 0 0 0;
    #M frgb 249 203 82;
    #M brgb 255 255 255;
    #M rgb2 127 127 127;
    #M rgb3 0 0 0;
    #M rgb4 37 52 91;
    #M rgb5 74 105 182;
    #M rgb6 112 158 18;
    #M rgb7 149 211 110;
    #M rgb8 187 9 201;
    #M rgb9 224 62 37;
    #M rgb10 7 114 128;
    #P newex 145 667 28 9109513 dac~;
    #N sfplay~ 2 120960 0 ;
    #P newobj 62 105 48 9109513 sfplay~ 2;
    #P newex 114 159 157 9109513 fffb~ 12 60 0.5 0.1;
    #P window linecount 2;
    #P comment 233 522 49 9109513 Transform Volume;
    #P window linecount 1;
    #P comment 284 101 100 9109513 transform strength;
    #P window linecount 2;
    #P comment 106 400 67 9109513 Wave Volume;
    #P window linecount 0;
    #P comment 1011 12 100 9109513;
    #P window linecount 1;
    #P comment 12 53 29 9109513 DAC;
    #P comment 45 52 100 9109513 WAV File Control;
    #P comment 299 414 100 9109513 Frequency Response;
    #P comment 248 352 141 9109513 List of frequencies to play;
    #P window linecount 5;
    #P comment 220 18 170 9109513 Michael Chladil 3/29/2007 MAX/MSP programming help from Jeremy Rozenstein Jonathan Lee Marcus;
    #P connect 43 0 10 0;
    #P connect 42 0 10 0;
    #P connect 41 0 10 0;
    #P connect 10 1 60 0;
    #P connect 10 0 60 0;
    #P connect 23 0 13 0;
    #P connect 9 0 13 0;
    #P connect 13 0 14 0;
    #P connect 59 0 60 1;
    #P connect 57 0 14 1;
    #P connect 10 1 9 0;
    #P connect 10 0 9 0;
    #P connect 23 0 16 0;
    #P connect 9 1 16 0;
    #P connect 16 0 17 0;
    #P connect 57 0 17 1;
    #P connect 46 0 11 0;
    #P connect 22 0 11 0;
    #P connect 60 0 11 0;
    #P connect 23 0 18 0;
    #P connect 9 2 18 0;
    #P connect 18 0 19 0;
    #P connect 46 0 11 1;
    #P connect 24 0 23 0;
    #P connect 57 0 19 1;
    #P connect 52 0 50 0;
    #P connect 50 0 44 0;
    #P connect 44 0 45 0;
    #P connect 14 0 15 0;
    #P connect 15 0 47 0;
    #P connect 47 0 49 0;
    #P connect 49 0 51 0;
    #P connect 51 0 52 0;
    #P connect 45 0 46 0;
    #P connect 17 0 15 1;
    #P connect 23 0 20 0;
    #P connect 9 3 20 0;
    #P connect 20 0 21 0;
    #P connect 58 0 46 1;
    #P connect 51 1 50 1;
    #P connect 53 0 52 1;
    #P connect 19 0 15 2;
    #P connect 47 1 48 0;
    #P connect 48 0 49 1;
    #P connect 57 0 21 1;
    #P connect 21 0 15 3;
    #P connect 26 0 15 4;
    #P connect 23 0 25 0;
    #P connect 9 4 25 0;
    #P connect 25 0 26 0;
    #P connect 28 0 15 5;
    #P connect 30 0 15 6;
    #P connect 57 0 26 1;
    #P connect 32 0 15 7;
    #P connect 23 0 27 0;
    #P connect 9 5 27 0;
    #P connect 27 0 28 0;
    #P connect 34 0 15 8;
    #P connect 15 0 12 0;
    #P connect 36 0 15 9;
    #P connect 57 0 28 1;
    #P connect 38 0 15 10;
    #P connect 23 0 29 0;
    #P connect 9 6 29 0;
    #P connect 29 0 30 0;
    #P connect 40 0 15 11;
    #P connect 57 0 30 1;
    #P connect 23 0 31 0;
    #P connect 9 7 31 0;
    #P connect 31 0 32 0;
    #P connect 57 0 32 1;
    #P connect 23 0 33 0;
    #P connect 9 8 33 0;
    #P connect 33 0 34 0;
    #P connect 57 0 34 1;
    #P connect 35 0 36 0;
    #P connect 23 0 35 0;
    #P connect 9 9 35 0;
    #P connect 36 0 56 0;
    #P connect 57 0 36 1;
    #P connect 23 0 37 0;
    #P connect 9 10 37 0;
    #P connect 37 0 38 0;
    #P connect 57 0 38 1;
    #P connect 23 0 39 0;
    #P connect 9 11 39 0;
    #P connect 39 0 40 0;
    #P connect 38 0 55 0;
    #P connect 57 0 40 1;
    #P connect 40 0 54 0;
    #P window clipboard copycount 62;

  • posted by Michael at 8:31 am  
