WOW! M.I.T. Researchers can recreate sound from objects in the room

posted on Aug, 14 2014 @ 01:52 PM
a reply to: GetHyped
You mean like the way blind people use echo location to build a mental picture of their environment?


Oh yeah & Radar!
edit on 14-8-2014 by funkadeliaaaa because: (no reason given)




posted on Aug, 14 2014 @ 01:59 PM
a reply to: funkadeliaaaa

Yes and no. The whole echo sounding some blind people can do (and all of us can do to some extent) is more akin to sonar, that is, getting an impression of the environment by listening to the acoustic response of that environment. Kind of how, even if blindfolded, you'd be able to tell which room in your house you're in by speaking or clapping your hands because your voice/clap would sound "different".

However, a spectrogram is nothing like echo sounding; it is just one of many ways of visualizing audio. In this particular instance it is transforming it from the time domain (the amplitude of the wave) into the frequency domain (a breakdown of the frequency content, from low to high).

Like this:
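
To make the time-domain to frequency-domain step concrete, here is a minimal sketch in Python. It is not taken from the paper: the 100 Hz test tone, the 800 Hz sample rate, and the helper names `dft_magnitudes` and `spectrogram` are all assumptions chosen for illustration. Each short frame of the signal goes through a plain discrete Fourier transform, and the stack of per-frame magnitudes is the spectrogram.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_len, hop):
    """Slice the signal into overlapping frames and transform each one."""
    rows = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        rows.append(dft_magnitudes(samples[start:start + frame_len]))
    return rows


# Assumed test signal: a pure 100 Hz tone sampled at 800 Hz.
SAMPLE_RATE = 800
FRAME_LEN = 64
tone = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(160)]

spec = spectrogram(tone, frame_len=FRAME_LEN, hop=32)

# The strongest bin of the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(peak_hz)
```

Real tools apply a window function and use an FFT; this version skips both to keep the idea visible.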




posted on Aug, 15 2014 @ 01:22 PM
a reply to: GetHyped

Wrong yet again. You guys need to stop blathering on about things that have nothing to do with the technology from the Researchers. If you notice, you guys never quote from the article or the paper about the technology to back up your claims. You just ramble on and on and it has nothing to do with the technology. You said:


Again, you are failing to grasp the technology here. The technology is analyzing vibrations in the spatial domain and converting them to vibrations in the time domain. The speed of the camera is nothing more than a function of the Nyquist limit. They even state that upper frequencies are problematic and result in a noisier signal hence faster FPS does not necessarily equate to a better quality signal. With all this in mind, masking and maintaining a good SNR is the same problem as any other recording technology. Again, we're just repeating the same thing to you over and over and you continue to demonstrate your failure to grasp the concepts behind this technology.


The speed of the camera is paramount with this technology and it's because the algorithm is able to get more detail as they get a camera with faster frames per second. This allows the algorithm to capture minute vibrations through successive frames. They explain this in very simple terms and they even tell you and show you the difference between cameras at different frames per second.


Reconstructing audio from video requires that the frequency of the video samples — the number of frames of video captured per second — be higher than the frequency of the audio signal. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That’s much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.
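
As a rough illustration of why frame rate matters, the sketch below applies the textbook sampling rule: a sampler running at fs samples per second can represent frequencies only up to fs/2 without ambiguity, and anything above that folds back to a lower apparent frequency. This is Python written for this thread, not code from the article; the helper names `nyquist_limit` and `alias_frequency` are hypothetical, and the sketch ignores the sensor quirk the article mentions for ordinary cameras.

```python
def nyquist_limit(sample_rate):
    """Highest frequency a uniform sampler can represent unambiguously."""
    return sample_rate / 2


def alias_frequency(freq, sample_rate):
    """Apparent frequency of a pure tone after uniform sampling."""
    # Fold the tone back relative to the nearest multiple of the sample rate.
    return abs(freq - sample_rate * round(freq / sample_rate))


# Frame rates mentioned in the article, treated here as audio sample rates.
for fps in (60, 2000, 6000):
    print(fps, "fps -> limit", nyquist_limit(fps), "Hz;",
          "a 440 Hz tone appears at", alias_frequency(440, fps), "Hz")
```

Under this simple model a 440 Hz tone survives intact at 2,000 or 6,000 fps but shows up at a different, lower frequency at 60 fps.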


It goes on to show you that they even managed to pick up sound to a lesser degree with a camera at 60 fps because of high frequency vibrations. When they used a camera with higher fps they picked up better sound.


In other experiments, however, they used an ordinary digital camera. Because of a quirk in the design of most cameras’ sensors, the researchers were able to infer information about high-frequency vibrations even from video recorded at a standard 60 frames per second. While this audio reconstruction wasn’t as faithful as that with the high-speed camera, it may still be good enough to identify the gender of a speaker in a room; the number of speakers; and even, given accurate enough information about the acoustic properties of speakers’ voices, their identities.


Even then they were able to pick up details about the speakers in the room even through high frequency vibrations.

All I ask is that instead of just blathering on about things that have nothing to do with what we're talking about that you actually quote from the article or the paper where your assertions are supported. We're getting a lot of nonsense that has nothing to do with what we're talking about so again, PLEASE SUPPORT WHAT YOU'RE SAYING THROUGH THE ARTICLE OR PDF!!



posted on Aug, 15 2014 @ 01:46 PM
a reply to: neoholographic

Dude, give it up. I've read the paper. I grasp the concepts. I've described exactly what's going on. You persist in misunderstanding not just the prerequisite background concepts but also the details of the paper. Enough of the Dunning-Kruger act already.



posted on Aug, 15 2014 @ 02:01 PM
a reply to: neoholographic
What are you still going on about? All you have to do is demonstrate this. You are trying to debate something that can easily be shown. There are a number of experiments that have been documented, none of which show the things you are talking about. For some reason the experiments that took place outside with multiple people talking and other noises around are not found anywhere. Why? They will be making their software available soon for FREE. Why not just do it yourself? You don't have to code anything because it's already done. It's software. It's no different than any other software you use every day. It works or it doesn't. It's like debating whether a car can go 0 to 60 in 4 seconds, and you are going on and on about it. Get in the car and go for it! Your debate is absolutely meaningless.
edit on 15-8-2014 by ZetaRediculian because: (no reason given)



posted on Aug, 15 2014 @ 02:21 PM

originally posted by: GetHyped
a reply to: neoholographic

Dude, give it up. I've read the paper. I grasp the concepts. I've described exactly what's going on. You persist in misunderstanding not just the prerequisite background concepts but also the details of the paper. Enough of the Dunning-Kruger act already.


In other words, you're just blathering on and you don't know what you're talking about. If you did, it would be easy to back it up like I do. Everything I said is backed up by the researchers, and I make sure I quote about the technology and the way the algorithm works.

You and others just keep repeating nonsense on top of nonsense that has nothing to do with what we're talking about.
edit on 15-8-2014 by neoholographic because: (no reason given)



posted on Aug, 15 2014 @ 02:29 PM
a reply to: neoholographic


Everything I said, is backed up by the researchers

No, it's not.



posted on Aug, 15 2014 @ 03:59 PM

originally posted by: neoholographic
Everything I said, is backed up by the researchers and I make sure I quote about the technology and the way the algorithm works.


Yeah, but here's the thing:

Other sources besides the target voice would impart vibrations on the chip bag or plant. Those other sources would impart vibrations that are 100th of a pixel -- just like the voice vibrations. Those other sources would impart vibrations that are only visible with the high speed camera -- just like the voice vibrations...

...basically, in a real world environment, other sources of vibration would become part of/intermingled with the target voice vibration, perhaps making it very difficult to hear the voice over the sound produced by the other vibrations.

The article doesn't say anything about voice vibrations being unique enough to single out the vibrations imparted on the chip bag by the target voice from those imparted on it by other sources -- if this were done in a real-world setting where other vibrations abound.

edit on 8/15/2014 by Soylent Green Is People because: (no reason given)



posted on Aug, 15 2014 @ 05:07 PM
a reply to: Soylent Green Is People

What??

What other vibrations are you talking about? When you say target voice and vibrations unique to the target voice, WHAT IN THE WORLD ARE YOU TALKING ABOUT?

WHO SAID ANYTHING ABOUT VIBRATIONS UNIQUE TO A TARGET VOICE??? WHAT ARE VIBRATIONS FROM A TARGET VOICE????

All of the sound around an object causes a vibration to occur and these small vibrations recreate the sound through visual data.

This is why I keep asking you and others to stop blathering on and on and actually quote from the article or PDF the parts to support your silly notions.

The vibrations that are less than 100th of a pixel are picked up through visual data.


“When sound hits an object, it causes the object to vibrate,” says Abe Davis, a graduate student in electrical engineering and computer science at MIT and first author on the new paper. “The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realize that this information was there.”


Nobody is talking about unique voice vibrations. Where do you get this nonsense?? When people are talking and, say, a car passes by, all of these things cause the object to vibrate. This is sound that hits the object. The motion of this vibration creates a very subtle visual signal less than 100th of a pixel. This is why they have to first capture this signal through successive frames.


That technique passes successive frames of video through a battery of image filters, which are used to measure fluctuations, such as the changing color values at boundaries, at several different orientations — say, horizontal, vertical, and diagonal — and several different scales.

The researchers developed an algorithm that combines the output of the filters to infer the motions of an object as a whole when it’s struck by sound waves. Different edges of the object may be moving in different directions, so the algorithm first aligns all the measurements so that they won’t cancel each other out. And it gives greater weight to measurements made at very distinct edges — clear boundaries between different color values.
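
The align-then-weight idea in the passage above can be sketched very loosely as follows. This is a simplified stand-in in Python, not the researchers' algorithm: it assumes alignment can be reduced to a sign flip against the first signal, it treats the weights as given "edge strength" numbers, and the function name `combine_motion_signals` and the sample data are made up for illustration.

```python
def combine_motion_signals(signals, weights):
    """Weighted average of local motion signals after sign alignment.

    signals: list of equal-length lists, one per image region/filter.
    weights: one non-negative weight per signal (assumed edge strength).
    """
    reference = signals[0]
    total_weight = sum(weights)
    combined = [0.0] * len(reference)
    for signal, weight in zip(signals, weights):
        # Flip signals that move opposite to the reference so they add
        # up instead of cancelling each other out.
        dot = sum(a * b for a, b in zip(reference, signal))
        sign = -1.0 if dot < 0 else 1.0
        for i, value in enumerate(signal):
            combined[i] += sign * weight * value / total_weight
    return combined


# Made-up example: three regions see the same motion, one of them inverted.
base = [0.0, 1.0, 0.0, -1.0]
signals = [base, [-x for x in base], [0.5 * x for x in base]]
weights = [2.0, 1.0, 1.0]  # the first region has the most distinct edge

combined = combine_motion_signals(signals, weights)
naive = [sum(col) / len(signals) for col in zip(*signals)]
print(combined)
print(naive)
```

In this toy case the plain average loses most of the motion because one region is inverted, while the aligned and weighted version keeps it.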


First they run the successive images through an image filter. This is why you get better quality when you go from 60 to say 5,000 fps. This magnifies the visual data such as changing colors at the boundaries, different orientations and different scales.

Why is this important? The algorithm then picks up these distinctions and recreates sound.

THERE ARE NO UNIQUE VOICE VIBRATIONS! WHO HAS TALKED ABOUT VOICE VIBRATIONS BEING UNIQUE ENOUGH???

You said:


Other sources besides the target voice would impart vibrations on the chip bag or plant. Those other sources would impart vibrations that are 100th of a pixel -- just like the voice vibrations. Those other sources would impart vibrations that are only visible with the high speed camera -- just like the voice vibrations...


What do you mean target voice? Of course other SOUNDS would cause the chips to vibrate. Who said that they wouldn't?



posted on Aug, 15 2014 @ 06:17 PM

originally posted by: neoholographic
a reply to: Soylent Green Is People
What do you mean target voice? Of course other SOUNDS would cause the chips to vibrate. Who said that they wouldn't?


I feel like we're making progress!

So... if a radio is playing loudly in the same room, this will also cause vibrations in the bag of chips.

And... because the loud radio causes stronger vibrations, it can effectively "drown out" or "mask" the vibrations caused by someone talking.

Which means... this new technique can be rendered nearly useless in exactly the same way as a traditional microphone - just create too much noise.

The point that everyone in this thread is trying to make, is this: yes it's an interesting technique, but it only works in ideal conditions. Changing those conditions slightly (ie, playing music while you talk) can be enough to stop it working effectively.
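
The masking argument can be put in numbers with a small signal-to-noise sketch. This is Python written for illustration only: the voice and radio are modelled as single sine tones, and the amplitudes, frequencies, and the helper names `rms` and `snr_db` are assumptions, not measurements from the paper.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, from RMS levels."""
    return 20 * math.log10(rms(signal) / rms(noise))


SAMPLE_RATE = 8000
N = 800  # 0.1 s, a whole number of cycles for both tones

# Assumed levels: the radio shakes the bag ten times harder than the voice.
voice = [0.1 * math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(N)]
radio = [1.0 * math.sin(2 * math.pi * 300 * t / SAMPLE_RATE) for t in range(N)]

# What the bag (or a microphone) actually sees is the sum of both.
mixture = [v + r for v, r in zip(voice, radio)]

print(round(snr_db(voice, radio), 1), "dB")
```

A negative result means the voice sits below the radio in the mixture, which is the same problem any recording method faces when the unwanted source is louder than the wanted one.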

I'm not sure what it says about a thread (or the poor deluded souls who are clinging to it so doggedly) when every page is basically a repetition of exactly the same points as the ten pages that came before it. It's been doing wonders for the "star count", mind you. Not exactly sure what the stars are supposed to do other than show agreement, to be honest, but I've reached the stage of being grateful for any positives that I can possibly find in this thread.




