WOW! M.I.T. Researchers can recreate sound from objects in the room


posted on Aug, 11 2014 @ 12:48 PM
link   
a reply to: neoholographic

It's amazing how you've managed to completely miss the point. 10 pages in and honestly, I don't think there's anything more I or anyone else can say on the matter that has not been rehashed about 100 times.




posted on Aug, 11 2014 @ 12:54 PM
link   
a reply to: GetHyped

Of course there isn't because most of the things you're saying have nothing to do with what I actually said.

You guys continue to make stuff up and then debate against the asinine things you make up.

How many times in this thread have I said show me where I said this and you guys simply ignore it because you can't show me where I said anything remotely close to what you've just spent an entire post debating against.

I have had to do this at least 7 or 8 times in this thread.

That's just being dishonest because you can't debate against the things I actually said so you just make it up as you go.
edit on 11-8-2014 by neoholographic because: (no reason given)



posted on Aug, 11 2014 @ 01:03 PM
link   

originally posted by: ZetaRediculian
a reply to: neoholographic

I am trying to understand what you ARE saying, which is quite difficult.

I think your position is that this process is somehow immune to noise. Can you explain what you are actually trying to say without all the usual flamboyancy?

You just seem to be going around in circles. First there are no unwanted vibrations, then the paper is shown, then the paper has nothing to do with anything, then you quote the paper, then you say there is no disturbance in the air, then you are shown the graphic from the paper, then there is disturbance in the air.

so can you very calmly say what you mean?


Neo --

Do what Zeta is asking...Please. That's what I asked from you a few posts back when I laid out my understanding about this "video microphone" in the numbered points. I wanted to make sure we were both discussing the same issue. However, you haven't responded to my post, so I still can't tell if we are. You keep posting other information that I didn't ask for.

Please, just clearly and succinctly spell out in a few simple bullet points what it is you feel these MIT engineers did, how it works in concept, and what the implications and applications are of this technology.

That way, we would then be able to talk about the same thing. Right now, I'm not sure what your argument is.


edit on 8/11/2014 by Soylent Green Is People because: (no reason given)



posted on Aug, 11 2014 @ 01:08 PM
link   



posted on Aug, 11 2014 @ 01:12 PM
link   
a reply to: Soylent Green Is People

Now we're back to square one.

I have asked and answered these things over and over again.

You guys then make up things that I never said and then when I knock those down you resort back to, just explain CLEARLY AND SUCCINCTLY ABOUT THIS TECHNOLOGY.

LOL, I have over and over again and the technology is simple to understand. I have gone over things like fps, the algorithm they use, how they recreate the sound through visual data and all sorts of different aspects of this technology ad nauseam.

It's actually simple and straightforward and there's no need to go through these things again because all of the nonsense you and others have spouted has been shown to be FLATLY DISHONEST and now you're back to square one which is just more nonsense.

So go back and read the posts in the thread. If you have any new questions that don't debate against things I never said then we can continue to debate.



posted on Aug, 11 2014 @ 01:21 PM
link   

originally posted by: neoholographic

It's actually simple and straightforward and there's no need to go through these things again because all of the nonsense you and others have spouted has been shown to be FLATLY DISHONEST and now you're back to square one which is just more nonsense.

So go back and read the posts in the thread. If you have any new questions that don't debate against things I never said then we can continue to debate.


The problem is that too much has been said, and I can't separate the wheat from the chaff. I truly no longer know what you are trying to say.

So I'll help out and repeat the numbered points I wrote before (Mods -- please forgive me for repeating this information). All you need to do is agree or disagree, and if you disagree, please succinctly tell me why.

Here's how I see this technology working:

1. A speaker creates a sound.

2. The sound creates pressure waves through the air (this is how sound travels through air).

3. The chip bag (or "crisp bag" for you non-Americans) is affected by the pressure waves.

4. The chip bag begins to vibrate due to those pressure waves, much like a microphone or speaker diaphragm.

5. The video camera looks at those vibrations, and then analyzes the frequency and amplitude of the vibrations of the bag.

In a traditional microphone, a magnet and electrical coil read the movement of the microphone diaphragm and translate that movement into electrical signals. In this method, the camera reads the movement, and the software converts that movement into the same type of electrical signals as the traditional microphone method.

At this point, those electrical signals could be played back again to recreate the sound, just like with traditional sound playback equipment.
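
The five steps above can be sketched as a toy simulation. This is only a sketch under stated assumptions, not the researchers' method: the frame rate, the test tone, and the pixel scale (`FPS`, `TONE_HZ`, `PIXELS_PER_UNIT`) are all made-up values, and the bag is assumed to follow the air pressure perfectly, which a real bag would not.

```python
import math

FPS = 5000               # assumed camera frame rate (frames per second)
TONE_HZ = 440.0          # assumed test tone played by the speaker
PIXELS_PER_UNIT = 0.01   # assumed: full-scale pressure moves the bag 1/100 of a pixel


def pressure(t):
    """Steps 1-2: the speaker's tone as a pressure wave at time t (seconds)."""
    return math.sin(2 * math.pi * TONE_HZ * t)


# Steps 3-4: the bag's displacement is assumed to follow the pressure exactly.
# Step 5: the camera sees that displacement once per frame, for one second.
frames = [PIXELS_PER_UNIT * pressure(n / FPS) for n in range(FPS)]

# Software stage: rescale the tiny displacements into a playable signal in [-1, 1].
peak = max(abs(x) for x in frames)
audio = [x / peak for x in frames]

# Sanity check: count upward zero crossings to estimate the recovered pitch.
crossings = sum(1 for a, b in zip(audio, audio[1:]) if a < 0 <= b)
print("estimated pitch (Hz):", crossings)
```

The recovered pitch should land within about one cycle of the tone that was played, because one second of frames is analyzed.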


This seems similar to the spy method of bouncing a laser off of a window to hear conversations inside the room on the other side of that window. Using the laser/window method, the laser reads the frequency and amplitude of the window vibrations, similar (albeit different) to the way the video camera and software can read the chip bag vibrations.

Do I have this right? Is that also your understanding of how this "recording sound using a video camera" works?

If not, what is your understanding about how this works?



edit on 8/11/2014 by Soylent Green Is People because: (no reason given)



posted on Aug, 11 2014 @ 01:35 PM
link   

originally posted by: neoholographic

Even with a camera with 60 fps, they were able to identify the gender of the speaker, the number of speakers and give accurate information about the acoustic properties of the speakers voices.



But what they couldn't do, according to the quote you referred us to, is establish what was being said. They could only establish that something was being said.

A microphone would be able to "recreate all the sound in the room", and it would be able to do it to a sufficient level of quality that you could most likely distinguish between background radio and the sound of people talking, providing the amount of noise remained within a certain range. A microphone, of course, is custom-built to do this and has many decades of practical experience, technology, and material sciences supporting it.

The bag is just a bag. The plant is just a plant. It doesn't matter if the equipment can detect movements "less than 100th of a pixel" because (i) the size of the movement is only part of the information needed to recreate sound, and (ii) in this case, it still isn't sensitive enough to pick up all the information needed, unless the information is very simple (no other sources of sound in the room and you don't need crystal-clear audio).

It goes without saying that trying to compare the data collected from a microphone to the data collected through this experiment, would be like comparing Da Vinci's Vitruvian Man to a four-year-old's stick figure drawing.

I look forward with interest to your next caps-lock-fuelled rant.



posted on Aug, 11 2014 @ 01:37 PM
link   

originally posted by: peck420


Everyone knows that meat is tastier if you tenderise it first!



posted on Aug, 11 2014 @ 01:47 PM
link   
a reply to: EvillerBob

Yes. The quality of the sound being captured is limited by the quality of the microphone (in this case, the microphone is a plant and a bag)

A regular traditional magnetic microphone can capture tiny movements (fast and invisible to the naked eye) in the microphone diaphragm -- probably as tiny and fast as the video camera can pick up. However, if I switched the traditional microphone diaphragm with a plant or a plastic bag, the magnetic pick-ups would still be able to detect vibrations, but the plant and bag itself would not be producing vibrations that are faithful to the sound being detected -- and the system would not be able to reproduce the sound as faithfully.

That's why I'm confused when Neo talks about video frame rates and 100th of a pixel movement. It's not like the video camera is doing a BETTER job of detecting the diaphragm vibrations (i.e., plant and bag diaphragm) than a magnetic microphone pick-up can.

In fact, this system is ripe for distortion due to the limitations of that plant/bag diaphragm. I would think a background noise would more easily mask the sound of a voice using a plant as a microphone.


edit on 8/11/2014 by Soylent Green Is People because: (no reason given)



posted on Aug, 11 2014 @ 02:15 PM
link   
a reply to: Soylent Green Is People

You start to go south at number 4 of your points.

4. the chip bag begins to vibrate due to those pressure waves, much like a microphone or speaker diaphragm.

5. The video camera looks at those vibrations, and then analyzes the frequency and amplitude of the vibrations of the bag.


The video camera doesn't look at those vibrations and then analyze the frequency and amplitude of the vibrations.

The algorithm looks at the visual data and picks up vibrations that are less than 100th of a pixel and then recreates the sound from the room.

This is why the Researchers did the experiment through soundproof glass.


Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass.

In other experiments, they extracted useful audio signals from videos of aluminum foil, the surface of a glass of water, and even the leaves of a potted plant. The researchers will present their findings in a paper at this year’s Siggraph, the premier computer graphics conference.

“When sound hits an object, it causes the object to vibrate,” says Abe Davis, a graduate student in electrical engineering and computer science at MIT and first author on the new paper. “The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realize that this information was there.”


Again, there isn't a one-to-one correspondence between the video camera and the vibrations that the sound creates when it comes to recreating sound from the room.

This is created through visual data that's usually invisible from the naked eye. So you need the algorithm to recreate the sound from the room which captures these vibrations that are less than 100th of a pixel.

This is why the Researchers also talk about frames per second and they even picked up these visual signals at 60 fps even though it wasn't as crisp as say a camera at 5,000 fps.

They capture the image in frame and then run these frames through an algorithm.

It's different than a microphone because you're talking about visual data at 100th of a pixel that's being used to recreate the sound.

This is why when you guys talk about disturbances in the air creating sounds and vibrations that mask the recreation of sound, it just doesn't apply.

The thing that's most important for this technology is frames per second. This is why you read this:


Reconstructing audio from video requires that the frequency of the video samples — the number of frames of video captured per second — be higher than the frequency of the audio signal. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That’s much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.


RECONSTRUCTING AUDIO FROM VIDEO REQUIRES THAT THE FREQUENCY OF THE VIDEO SAMPLES - THE NUMBER OF FRAMES PER SECOND - BE HIGHER THAN THE FREQUENCY OF THE AUDIO SIGNAL.

They did this at 60 fps and with cameras that are over 5,000 fps. It then says:


That technique passes successive frames of video through a battery of image filters, which are used to measure fluctuations, such as the changing color values at boundaries, at several different orientations — say, horizontal, vertical, and diagonal — and several different scales.


This is why I keep asking how can disturbances from the air create vibrations that mask the recreation of sound if we're not talking about Contact Vibrations and even then they never said they couldn't recreate sound.



posted on Aug, 11 2014 @ 02:19 PM
link   
a reply to: EvillerBob

What???


But what they couldn't do, according to the quote you referred us to, is establish what was being said. They could only establish that something was being said.


You do know they couldn't understand what was being said because they were trying the experiment with a camera that was 60 frames per second???

When they used a better camera with faster frames per second they could understand what's being said.


Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass.

Reconstructing audio from video requires that the frequency of the video samples — the number of frames of video captured per second — be higher than the frequency of the audio signal. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That’s much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.


They did these experiments with cameras that have different frames per second.

You do know the difference?



posted on Aug, 11 2014 @ 02:31 PM
link   

originally posted by: neoholographic
a reply to: Soylent Green Is People

You start to go south at number 4 of your points.

4. the chip bag begins to vibrate due to those pressure waves, much like a microphone or speaker diaphragm.

5. The video camera looks at those vibrations, and then analyzes the frequency and amplitude of the vibrations of the bag.


The video camera doesn't look at those vibrations and then analyze the frequency and amplitude of the vibrations.

The algorithm looks at the visual data and picks up vibrations that are less than 100th of a pixel and then recreates the sound from the room.

This is why the Researchers did the experiment through soundproof glass.


Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass.

In other experiments, they extracted useful audio signals from videos of aluminum foil, the surface of a glass of water, and even the leaves of a potted plant. The researchers will present their findings in a paper at this year’s Siggraph, the premier computer graphics conference.

“When sound hits an object, it causes the object to vibrate,” says Abe Davis, a graduate student in electrical engineering and computer science at MIT and first author on the new paper. “The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realize that this information was there.”


Again, there isn't a one-to-one correspondence between the video camera and the vibrations that the sound creates when it comes to recreating sound from the room.

This is created through visual data that's usually invisible from the naked eye. So you need the algorithm to recreate the sound from the room which captures these vibrations that are less than 100th of a pixel.

This is why the Researchers also talk about frames per second and they even picked up these visual signals at 60 fps even though it wasn't as crisp as say a camera at 5,000 fps.

They capture the image in frame and then run these frames through an algorithm.

It's different than a microphone because you're talking about visual data at 100th of a pixel that's being used to recreate the sound.

This is why when you guys talk about disturbances in the air creating sounds and vibrations that mask the recreation of sound, it just doesn't apply.

The thing that's most important for this technology is frames per second. This is why you read this:


Reconstructing audio from video requires that the frequency of the video samples — the number of frames of video captured per second — be higher than the frequency of the audio signal. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That’s much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.


RECONSTRUCTING AUDIO FROM VIDEO REQUIRES THAT THE FREQUENCY OF THE VIDEO SAMPLES - THE NUMBER OF FRAMES PER SECOND - BE HIGHER THAN THE FREQUENCY OF THE AUDIO SIGNAL.

They did this at 60 fps and with cameras that are over 5,000 fps. It then says:


That technique passes successive frames of video through a battery of image filters, which are used to measure fluctuations, such as the changing color values at boundaries, at several different orientations — say, horizontal, vertical, and diagonal — and several different scales.


This is why I keep asking how can disturbances from the air create vibrations that mask the recreation of sound if we're not talking about Contact Vibrations and even then they never said they couldn't recreate sound.

I've tried hard to refrain from replying to this thread after I bowed out, but holy cow! This is seriously way over your head, and I can only hope the other posters stop wasting their time explaining in detail the principles you are having trouble understanding.

Over and over you make an argument, then demonstrate that you don't understand what is really happening to the bag, plant, etc. to make it exhibit "vibrations of less than 100th of a pixel", which is a phrase you seem to think somehow supports your skewed idea of the mechanics taking place. In fact, the "less than 100th of a pixel" only indicates that this is a very sensitive procedure, and is easily influenced by any source of sound waves besides the intended audio.



posted on Aug, 11 2014 @ 02:40 PM
link   

originally posted by: neoholographic
You start to go south at number 4 of your points.

4. the chip bag begins to vibrate due to those pressure waves, much like a microphone or speaker diaphragm.

5. The video camera looks at those vibrations, and then analyzes the frequency and amplitude of the vibrations of the bag.


The video camera doesn't look at those vibrations and then analyze the frequency and amplitude of the vibrations.

The algorithm looks at the visual data and picks up vibrations that are less than 100th of a pixel and then recreates the sound from the room.



The two sentences you wrote that I highlighted in orange are two conflicting statements. In the first sentence you said it does NOT analyze the vibrations, then in the next sentence you said it does (i.e., it "looks at the visual data and picks up vibrations that are less than 100th of a pixel"). Looking at the pixels to detect vibration IS analyzing the amplitude and frequency of the vibrations.

What I mean is, it detects how much (amplitude) and how fast (frequency) the plant or bag is moving by looking at it.

All vibrations have an amplitude and a frequency. To recreate the sound, this system would NEED to measure the amplitude and frequency of the vibrations -- even the tiny and fast vibrations. The camera and software is what is detecting that frequency and amplitude of vibration. If it didn't, then this would not work.
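
To make "measuring amplitude and frequency" concrete, here is a minimal sketch that pulls both out of a sampled displacement signal with a single-bin discrete Fourier transform. Everything in it is an assumption for illustration: the frame rate, the two test vibrations (200 Hz and 450 Hz), and their sizes in pixels are invented, and `dft_magnitude` is a hypothetical helper, not anything taken from the paper.

```python
import math


def dft_magnitude(samples, sample_rate, freq_hz):
    """Amplitude of the freq_hz component of samples (single-bin DFT)."""
    n = len(samples)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


fps = 2000  # assumed frame rate, so frequencies up to 1,000 Hz can be measured

# Assumed bag displacement in pixels: a stronger 200 Hz vibration plus a
# weaker 450 Hz one, observed for one second.
displacement = [
    0.004 * math.sin(2 * math.pi * 200 * i / fps)
    + 0.001 * math.sin(2 * math.pi * 450 * i / fps)
    for i in range(fps)
]

for freq in (200, 300, 450):
    print(freq, "Hz amplitude (pixels):",
          round(dft_magnitude(displacement, fps, freq), 6))
```

Scanning a range of frequencies this way yields the "how much" (amplitude) at each "how fast" (frequency), which is the information a playback stage needs.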


Therefore, the video camera IS looking at the frequency and amplitude of the tiny (invisible to the eye) vibrations of the plant and chip bag. The vibrations are what would be causing sound if the bag and plant were a traditional microphone.

The only difference between this video camera "microphone" and a traditional microphone is what is being used to detect those tiny vibrations in the diaphragm -- whether that diaphragm be the traditional kind found in a regular microphone, or that diaphragm is a plastic bag.

The "pick up" device in a traditional microphone is a magnet and electrical coil. The vibrations of the traditional microphone diaphragm make the magnet move, and those tiny movements of that magnet (even some movements faster and smaller than the eye can see) are detected by the electrical coil as varying electrical impulses that correspond to the frequency and amplitude at which the diaphragm was vibrating.

This video camera "microphone" is extremely similar in concept. The chip bag or the plant is the diaphragm. However, instead of a magnet and electrical coil detecting the frequency and amplitude of the bag vibrations, the video camera is doing that instead. The software then takes the "visual data" (the amplitude and frequency of the bag vibration) and converts it into electrical impulses corresponding to that frequency and amplitude -- just like a traditional microphone.

....And that's how this works. As I said, it is very clever, but it is still just another method of detecting the frequency and amplitude of vibrations in a microphone diaphragm. That diaphragm just happens to be a bag or a plant, and the vibrations are being detected visually instead of electromagnetically.

The laser method of detecting the frequency and amplitude of vibrations (used by spies on windows) is similar to this video camera method -- i.e., it is still doing the basic job of detecting vibrations of a diaphragm, but it uses a laser measuring system, while this video method uses a visual measuring system (measuring changes in pixels).


edit on 8/11/2014 by Soylent Green Is People because: (no reason given)



posted on Aug, 11 2014 @ 02:49 PM
link   

originally posted by: neoholographic
a reply to: Soylent Green Is People


This is where your misunderstanding stems from so let's clarify:

1) The AIR PRESSURE WAVES are causing the bag to vibrate.

2) The reason they need a high-speed camera is to capture the vibrations caused by the AIR PRESSURE WAVES that lie within the human hearing range, which runs from 20 Hz (20 oscillations per second) to 20,000 Hz (20 thousand oscillations per second).

3) To capture a signal digitally, you need a sampling rate of at least twice the highest frequency you wish to record.

4) Recording at 5,000 FPS will allow you to record up to 2.5 kHz (2.5 thousand oscillations per second), which is well within the hearing range of human beings (and also within the range of intelligible speech). The speed of the camera simply sets the sampling rate.

5) Again, there is nothing magical going on here. It is the AIR PRESSURE WAVES that are causing the bag to vibrate at these frequencies.
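
The sampling arithmetic in points 3 and 4 can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and it treats each video frame as one ideal sample, which ignores real-camera effects such as exposure time.

```python
def nyquist_limit_hz(frames_per_second):
    """Highest frequency that can be captured without aliasing."""
    return frames_per_second / 2.0


def apparent_frequency_hz(tone_hz, frames_per_second):
    """Frequency a pure tone appears to have once sampled at this frame rate.

    Tones above the Nyquist limit fold back into the 0..fps/2 band (aliasing).
    """
    folded = tone_hz % frames_per_second
    return min(folded, frames_per_second - folded)


# Frame rates mentioned in the thread and in the quoted article.
for fps in (60, 2000, 5000, 6000):
    print(fps, "fps -> usable up to", nyquist_limit_hz(fps), "Hz")

# A 100 Hz tone filmed at 60 fps does not show up as 100 Hz.
print("100 Hz at 60 fps appears as", apparent_frequency_hz(100, 60), "Hz")
```

Under these assumptions, a 60 fps camera covers only up to 30 Hz directly, and anything higher is folded down to a different apparent frequency.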



This is why I keep asking how can disturbances from the air create vibrations that mask the recreation of sound if we're not talking about Contact Vibrations and even then they never said they couldn't recreate sound.


Contact vibrations are nothing more than a different medium that the sound is propagating through. The other one is the AIR, which is causing the bag to vibrate in the first place. Adding more unwanted "disturbances from the air" will mess up the recreation of the sound for reasons discussed many times over.

From the paper:


An input sound (the signal we want to recover) consists of fluctuations in air pressure at the surface of some object. These fluctuations cause the object to move, resulting in a pattern of displacement over time that we film with a camera. We then process the recorded video with our algorithm to recover an output sound.

edit on 11-8-2014 by GetHyped because: (no reason given)



posted on Aug, 11 2014 @ 04:30 PM
link   
a reply to: neoholographic

I'm not understanding what you are going on about. This is some pretty basic stuff. There is still nothing in any of your rants that shows this stuff is anything more than what has been shown in the video. They show that they were able to extract a single sound from objects in a controlled environment. They did not show that they could extract sound in cases where there was unwanted noise or other sounds. They did not show that they could extract multiple sounds. What they have shown and documented is not what you are thinking they have shown. There really is nothing more to this than what is there. Sound is sound.



posted on Aug, 11 2014 @ 04:32 PM
link   
a reply to: Soylent Green Is People

No, they're not conflicting statements.

In one statement I said the VIDEO CAMERA doesn't analyze the vibrations.

In the next sentence I said:

The ALGORITHM analyzes the vibrations and recreates the sound from the room.

The video camera doesn't ANALYZE THE PIXELS TO DETECT VIBRATIONS.

The video camera simply captures these vibrations and the technology ANALYZES these vibrations and recreates the sound coming from the room.

The video camera doesn't analyze anything, the technology does through the algorithm.

If you can show me in the paper where it says the video camera analyzes these vibrations, I would like to see it.

You said:


What I mean is, it detects how much (amplitude) and how fast (frequency) the plant or bag is moving by looking at it.


THIS ISN'T WHAT YOU INITIALLY SAID:

5. The video camera looks at those vibrations, and then analyzes the frequency and amplitude of the vibrations of the bag.

No it doesn't. I agree the camera detects these vibrations to be analyzed by the technology and then they recreate sound.

This isn't what you initially said and I suspect you realized your error and used DETECT but the sad thing for you is I SAW YOUR ERROR TOO LOL.

Now let's go onto your other contention.

This video camera "microphone" is extremely similar in concept.

When did I ever say they were not similar concepts?

Again, I ask WHAT ARE YOU DEBATING? WHY ARE YOU DEBATING AGAINST SOMETHING I NEVER SAID??

I said this can be more effective than a microphone and this is why they're looking at applications in law enforcement and forensics.

I never said they were not similar concepts.



YOU GUYS HAVE A BAD HABIT OF DEBATING THINGS THAT I NEVER SAID.

The fact is with a microphone the sound needs to be picked up by the microphone. Here's a simple example on how a microphone works.



With a microphone sound is picked up by the mic. With this technology you will have more flexibility because you can use different objects throughout the room to use to recreate sounds from the room and I'm sure this is why applications in law enforcement and forensics are being looked at.

Let's look at a case of John Gotti.

John Gotti would talk to his people outside of their hangout because they wanted to avoid their conversation being picked up by a microphone. Here's a video:



Now unless you have Gotti or the persons he's talking to mic'd up, then you can't hear what they're saying.

With this technology, the clothes they're wearing, their bodies, the trees and walls around them become microphones.

This is why they will look at this technology for applications in law enforcement and forensics. It will be like putting a mic on criminals without turning someone to wear a mic.

Like I said, it's technology that can be very useful and you guys need to try and understand what you're talking about.
edit on 11-8-2014 by neoholographic because: (no reason given)



posted on Aug, 11 2014 @ 04:52 PM
link   

originally posted by: neoholographic
With this technology, the clothes they're wearing, their bodies, the trees and walls around them become microphones.

The most prominent noise, that you would pick up with this technology, based on the videos, will be traffic.

The vibrations caused by 3000 lbs+ objects, moving at high speeds (relatively), will be far more powerful than any vibration caused by the sound of a human voice.

I have built multiple buildings that require high levels of harmonic control...and traffic is an absolute killer. Don't even get me started on trains and public transit.
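
To put rough numbers on the masking argument, here is a small sketch of two sources adding together at the same object. The amplitudes and frequencies are pure assumptions chosen for illustration (a 220 Hz "voice" moving the object ten times less than a 60 Hz "traffic" rumble); none of them are measurements from the paper or the videos.

```python
import math


def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels (negative means noise dominates)."""
    return 20.0 * math.log10(rms(signal) / rms(noise))


fps = 5000  # assumed camera frame rate

# Assumed displacements in pixels over one second of video.
voice = [0.002 * math.sin(2 * math.pi * 220 * i / fps) for i in range(fps)]
traffic = [0.02 * math.sin(2 * math.pi * 60 * i / fps) for i in range(fps)]

# The object only has one motion: the sum of everything pushing on it.
observed = [v + t for v, t in zip(voice, traffic)]

print("voice vs traffic SNR (dB):", round(snr_db(voice, traffic), 2))
print("observed level vs voice alone:", round(rms(observed) / rms(voice), 2))
```

With these made-up numbers the camera would be recording mostly traffic, so any recovery of the voice would depend on separating the two afterwards.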



posted on Aug, 11 2014 @ 05:03 PM
link   
a reply to: peck420

Again, you have no idea what you're talking about. You said:

The vibrations caused by 3000 lbs+ objects, moving at high speeds (relatively), will be far more powerful than any vibration caused by the sound of a human voice.

Who said the vibrations caused by a human voice are different than the vibrations of a car driving by??

The sound is recreated as a whole so there wouldn't be vibrations coming from a car passing by that's any different than the vibrations being created by a human voice. The same way a mic will pick up the conversation with cars passing by if you had a criminal wired, is the same way this technology will pick up what's being said.


The researchers developed an algorithm that combines the output of the filters to infer the motions of an object as a whole when it’s struck by sound waves. Different edges of the object may be moving in different directions, so the algorithm first aligns all the measurements so that they won’t cancel each other out. And it gives greater weight to measurements made at very distinct edges — clear boundaries between different color values.
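
As a rough illustration of the idea in that quote, and not the researchers' actual code, the "align, then weight" step could look something like the sketch below. The function name, the use of the first measurement as the reference, and the sign-flip alignment are all assumptions made for this example.

```python
def align_and_combine(local_signals, edge_weights):
    """Combine per-edge motion signals into one estimate of the object's motion.

    Each signal is flipped, if needed, to agree in sign with the first one
    (so opposite-moving edges do not cancel out), then averaged using the
    edge weights (stronger edges count for more).
    """
    reference = local_signals[0]
    combined = [0.0] * len(reference)
    for signal, weight in zip(local_signals, edge_weights):
        agreement = sum(a * b for a, b in zip(signal, reference))
        sign = -1.0 if agreement < 0 else 1.0
        for i, value in enumerate(signal):
            combined[i] += weight * sign * value
    total_weight = sum(edge_weights)
    return [c / total_weight for c in combined]


# Two edges seeing the same motion, but moving in opposite directions.
motion = [0.0, 1.0, 0.0, -1.0]
opposite = [-x for x in motion]

naive = [(a + b) / 2 for a, b in zip(motion, opposite)]       # cancels out
aligned = align_and_combine([motion, opposite], [3.0, 1.0])   # does not

print("naive average:", naive)
print("aligned and weighted:", aligned)
```

Note that this step only combines measurements of the same overall motion; it does nothing to separate one sound source from another.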


So they're recreating the sound similar to a microphone but using visual data. This will give them an advantage and the algorithm in the future may even be able to distinguish between a human voice and say a horn blowing in the background and then have the algorithm just play back the sounds of the human voice that's being recreated.

So again, you have to actually try and read about the technology and you will know why they will look for applications in law enforcement and forensics through this technology.



posted on Aug, 11 2014 @ 05:13 PM
link   
a reply to: neoholographic


With this technology, the clothes they're wearing, their bodies, the trees and walls around them become microphones.

It would have to be shown that it works like this. None of the experiments were done outside or with multiple sounds. If you think it works and that it's ready, that's fine, but you may never hear of it again because of the obvious issues. Their best was 15 feet, one voice. Maybe invest or start a company.



posted on Aug, 11 2014 @ 05:15 PM
link   
a reply to: neoholographic

Yes, we know the video camera is capturing the vibrations as images.

Yes, we know that the algorithm is processing these vibrations in the spatial domain and computing an audio signal in the time domain.

Yet you're still making fundamental misconceptions about the technology and underlying principles and have the hubris to imply that everyone else has the misunderstanding here.

You said:


The sound is recreated as a whole so there wouldn't be vibrations coming from a car passing by that's any different than the vibrations being created by a human voice. The same way a mic will pick up the conversation with cars passing by if you had a criminal wired, is the same way this technology will pick up what's being said.


This is still fundamentally wrong for reasons specified countless times. Others and I have explained exactly where your misconceptions lie over and over again. It's getting very silly now.

You said:

So they're recreating the sound similar to a microphone but using visual data. This will give them an advantage and the algorithm in the future may even be able to distinguish between a human voice and say a horn blowing in the background and then have the algorithm just play back the sounds of the human voice that's being recreated.


Which demonstrates nothing more than your fundamental misconceptions of not just this technology but the basic physics and acoustic principles behind it.



