Hacker News

This is super impressive and something that I didn't think would be possible without someone very skilled in photoshop going over the images.

As a photo enthusiast, I am very excited about this, but also a little worried that soon very simple apps are capable of doing the craziest of edits through the power of neural nets. Imagine the next 'deep beauty transfer', able to copy perfect skin from a model onto everyone, making everything a little more fake and less genuine.

The engineer in me now wants to understand how to build something like this from scratch but I think I'm probably lacking the math skills necessary.



Here is an (in progress) article I am working on that might help you: https://harishnarayanan.org/writing/artistic-style-transfer/

Repository of explanatory notebooks: https://github.com/hnarayanan/artistic-style-transfer


This is an excellent blog post, thanks!

Edit: the most comprehensive, best-illustrated treatment of this topic I have seen so far.


I think you need a little more information on the Gram matrix (maybe even ditch some of the elementary "this is gradient descent" stuff and just assume the reader knows a bit more about convnets so that you can dive deeper into the style transfer specifics -- there are plenty of other sources that cover the former already).
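To make the request concrete, here's a minimal NumPy sketch (my own illustration, not taken from the article; the feature-map shape is made up) of the Gram-matrix statistic used in the style loss:

```python
import numpy as np

# Hypothetical feature map from one convnet layer: height x width x channels.
features = np.random.rand(32, 32, 64)

# Flatten the spatial dimensions so each row is one channel's activations.
F = features.reshape(-1, features.shape[-1]).T  # shape: (channels, h*w)

# The Gram matrix measures which channels co-activate, discarding
# spatial layout -- that's what makes it a "style" statistic rather
# than a "content" one.
G = F @ F.T  # shape: (channels, channels), symmetric
```

The style loss then compares the Gram matrices of the generated image and the style image at several layers.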


Coming soon! This got way more popular than I expected well before it was finished.


You need to make a correction: the map isn't linear with the addition of the b vector. It's an affine map.
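For example (my own illustrative numbers), the bias term breaks linearity at the origin, since a linear map must send zero to zero:

```python
import numpy as np

W = np.array([[2.0, 0.0], [0.0, 3.0]])
b = np.array([1.0, 1.0])

# f(x) = Wx + b is affine, not linear: f(0) = b != 0.
f = lambda x: W @ x + b

print(f(np.zeros(2)))  # prints the nonzero bias, [1. 1.]
```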


I predict cameras with some built in style transfer filters.

This specific blog post made me realize how great some of the pictures look after the edits on colors.



Yes thank you for the proper name!


Oh sweet, thanks! I'm definitely bookmarking this


This is awesome, thanks!


> I didn't think would be possible without someone very skilled in photoshop going over the images

And this is similar to how deep learning will likely erode the need for programming (IMHO).

Deep learning won't necessarily write programs (any more than this AI manipulates images via Photoshop). Folks who say "don't worry, AI can't easily write programs" are missing the point: writing programs isn't necessary for there to be widespread disruption.

Most programming is essentially hooking up I/O (of which UIs are a subset) to APIs, data stores and data manipulation. The "goal" of programming is not the code, but the functionality it provides.

AIs don't need to learn to code any more than they need to learn to use Photoshop. They need to learn to provide functionality (or in this case, manipulate image data).


> "AIs don't need to learn to code any more than they need to learn to use Photoshop. They need to learn to provide functionality (or in this case, manipulate image data)."

This is interesting. My counterpoint would be that if you rely on AI over programs you lose human-editability and determinism. So fixing a bug or adding a new feature might mean diving into some opaque model rather than adding a few lines of code. You couldn't do anything where consistency is important, like security, manipulating a database with important information, or GUI design. I think that at least protects large swaths of software development.

Even this example seems less like a replacement for Photoshop and more like a cool new feature Photoshop could add


In the real world we rely on humans for lots of stuff, and humans aren't actually deterministic either. Sure, if you train someone to perform a task they'll probably do a good job, or they might suddenly come into work distracted and cause a problem. Diagnosing problems with people is often similarly hard, and we've had all of civilization to work on it.

This hasn't caused the sky to fall, yet. So, perhaps we'll just learn to make AI behave properly under most circumstances, and deal with failures and glitches as we always have with people.


_Coding_ is more about understanding what your boss/clients want and turning it into something more concrete, so it's merely an NLP problem. This will, I think, see adoption in "app/website builder" tools like Squarespace.

Then there is real programming, which IMHO will get automated in the far future.


It's an NLP problem if your boss/client wants to talk with you. At the end of the day, they don't really "want" to TALK WITH YOU; they want the functionality they get as a result of talking with you.

If there are other ways for them to efficiently get the functionality, they are good with that (as much as they might like you).

Similar to you wanting a pizza. You could call and talk with someone (which you don't really want, it was a necessary step), or you fill out the right form/app. Either way, you want the result, not the process.

Your boss/client wants the result of your work, not necessarily the work process required to get it.

It seems likely to me that modern deep learning enabled tools will make it easier for your boss/client to get the result they want directly.

Deep learning + more graphically oriented data flow UIs seems like it will heavily erode the need for traditional programming as users will be able to more directly achieve the functionality they are looking for.


There are many scenarios where we still need provable code/security, e.g. health- and safety-related matters.

Not sure I would fully entrust a trained AI to control even an elevator door, where failure could result in bodily harm.


The planning an AI could take over from existing elevator controllers already operates through constrained access to the motors. Nothing about AI demands stupid system design.


No, but an AI will take over all the elevator scheduling code. It just needs to keep tuning the knobs until everyone gets to their floor as fast as possible.


Please don't build an elevator whose AI is instructed to get people to their floor "_as fast_ as possible"


Someone still has to program the AI for the foreseeable future.


If you're writing machine learning/intelligence, please just consider how to do so without condemning us to a dystopian future.


or if it's a dystopian one, at least make it a cool dystopian one.


So while maximising paperclip production, it should also manipulate the stock market to bring down the price of black leather pants?


no, don't, please. programming is a means to an end.

programming is fun the same way long division by hand is fun.

i can't wait for the robots to release me from the monotony.


Not everyone would agree with you :) Although I think even most of us who enjoy programming would be happy to have some form of automation as an option.


I suspect it's not going to be long until there's a rather creepy "nudity from source image" generator.


There's your killer app.

Feed it a million or so porn images of people with all kinds of different body types. Then have it guess the closest match. Finally, run this. Presto change-o! It's those x-ray glasses kids everywhere have wanted for years.

I could easily imagine it being done live-motion and 3-d. Run the whole thing on a set of AR glasses.


Or morph a face onto existing footage. I'm surprised this doesn't more obviously exist already, although I guess I haven't been actively looking for it.


The tech is there to do a rough approximation of a dozen combinations of this. I could imagine an intermediate step where a wire-frame mesh is constructed around the image. As I understand (And I know nothing of this stuff), there was already an app to take a picture of breasts and jiggle them.

I think it's just that nobody has put all of the pieces together yet. (Or if they have, the mainstream media hasn't heard about it)


Amazing how technology comes together; this seems like the type of thing people were dreaming about a mere 10 years ago.


It's all fun and games until someone feeds it pictures of children.

Now you have an illegal information firehose.


I can't help but think that DiscoGAN (https://arxiv.org/abs/1703.05192 "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks", Kim et al 2017) would be perfect for this. Simply feed it a ton of photos of regular clothed people and other naked people (don't have to be the same people), and it'll learn a mapping on its own. Scale that up, refine it for a few years...


DiscoGAN is amazing, and the name, lol


Cue John Travolta?


I'm okay with creepy. Let me know when this comes out if you find it first!


Looks like I can't edit this to avoid gaining more down votes. I'm not sure if I'm losing points for asking to be informed of technological advancements or that I'm okay with being creepy. Any insight is appreciated.


It might not be obvious, but there are actual people whose lives are impacted negatively by creeps who will go to any length to get their fix.

Your comment would have been fine without the "I'm okay with creepy" part.


> able to copy perfect skin from a model onto everyone, making everything a little more fake and less genuine

Everybody will do it for a couple of years, then it will get old and people will have to think of something original. Just like autotune.


Only we still have autotune everywhere.

And 20+ years on, all cover shots etc are still photoshopped.


Yes, but autotune is a style choice for artists. It's not something every musician uses. It hasn't taken over music, and it's not like anyone can become a musician because autotune exists.


Disclaimer: I'm not a music producer, and my ear isn't that great, but I did write software for one of the major pitch-correction vendors for many years, and I think I have a better-than-average ear for it after listening to it for many years.

Pitch correction is something which is used on many, many professionally-produced tracks, and often without the knowledge or consent of the performers. Whether you can hear it or not is a stylistic choice (provided adequate skill from the production team: see [1]). But just because the pitch correction isn't in your face, T-Pain- or Cher-style, doesn't mean it isn't there. The software is better than that, and in the right hands, it just makes people sound more skilled than they are, and you can't hear it.

Producers generally are pretty quiet about where they use it to mask blemishes in the performance, probably because they don't want to embarrass anyone. But the producers we sold to would certainly say how much they used it, without naming artists or tracks.

[1] http://productionadvice.co.uk/aretha-autotune/


I believe I hear pitch correction whenever it's used. Do you have an example where it is used and I would struggle to hear it?


I was involved in the recording and production for a top 40 producer, and can confirm that there was autotune on every single vocal track that left the studio.

Here are a few that I was in the room when the artist was recording, and can confirm pitch correction:

https://www.youtube.com/watch?v=450p7goxZqg

https://www.youtube.com/watch?v=M8uPvX2te0I

https://www.youtube.com/watch?v=E0oyglKjbFQ


It would've been much better if you had posted 6 links, 3 with and 3 without autotune. See if people can figure out which is which.


The first one has that metallic sound that is a dead giveaway. The first falsetto is quieter than the second one, and as he increases the loudness of his voice you can really hear the metallic sound kick in: https://youtu.be/450p7goxZqg?t=1m27s

The second one has a "Cher moment" almost straight away: just after "wandering the desert a thousand days", the following "mmmm" has a glissando between two notes where we clearly hear the hard edge of what I assume is an auto-tune lookahead. I don't actually know how they work; I just assume there's a lookahead for the next note approximation, which makes glissandos sound funny. https://youtu.be/M8uPvX2te0I?t=31s

The last one I can't really fault for too much autotune, more for a lack of it. The bridge is especially intense: https://youtu.be/E0oyglKjbFQ?t=1m51s


Say the singer loses the pitch slightly for half a second on a held note. If that fluctuation is corrected, what auditory information could be left for you to detect the modification?


I believe I hear pitch correction when it's obviously used, and it's a lot. Pretty much most of the "top forty" pop pablum from the last 20 years. I believe there is pitch correction that I don't spot: the "dark matter" of pitch correction that is done less cheesily.

The worst of it sounds almost like packet loss concealment in a G.722 voice stream: the sustained part of a vocal note basically sounding synthesized.


>Yes, but autotune is a style choice for artists. It's not something every musician uses. It hasn't taken over music, and it's not like anyone can become a musician because autotune exists.

You'd be surprised on both fronts :-)

On the first, because autotune is prevalent regardless of genre and style choices (even in rock, country, etc.). It's just the Cher/T-Pain effect that has been toned down; autotune is very much in use in the industry for vocal correction.

On the second, because almost any crap-singer pop idol with nice looks can pretend to be in tune and turn out bearable results because of autotune.


And it's awesome. Music shouldn't be singing olympics; self-expression and original ideas are worth more than technical skill.


We disagree on that, and that's OK. To me there is something special about a live performance, even more so when it's challenging for the performer. When a singer demonstrates range, or a musician displays technical excellence or provides emotional depth through expression, it adds a LOT in my opinion. Knowing this is all faked in recorded music takes something away from it.

Ditto for photography. To take an image and retouch it, or to artificially saturate colors can make a great picture. But with a raw photo it's even more interesting to think that scene actually existed and someone captured it for us to look at.

In either case, I can enjoy the work but will only be impressed if I know that it's authentic. This is more true than ever today.

Holography, you can't currently fake that.


But then again, where do you draw this line for what is authentic and isn't?

In music are you allowed to use amplifiers and speakers? They can add a lot of color and distortion. How about reverb? Rooms that aren't there. EQ to remove unwanted frequencies? Synthesizers? Digital effects? At what point is it not authentic anymore?

Same for photography. Are you allowed to touch the aperture? ISO? Shutter speed? Flash? Digital camera? At what point is it not authentic?


The thing is, the subject of a photo, the scene and its subjects, are usually not the artists. The photographer is the artist. Photography is processing from the get-go: how the film or CCD responds to light and so on. The grain from low-light on sensitive film can be part of the art and so on.

If you mess with the colors of a scene, you're not taking away artistic control from that scene.

You also don't put limitations on the post processing art; you're not doing it to fool anyone.

There is post processing in music that is obvious art in an analogous way, like taking sampled sounds and re-mixing them to create new stuff. There are effects that are obviously effects. I'm not going to scoff at a great studio reverb, or some echo applied to a vocal or whatever. Nobody is saying that this was recorded in some fjord in Sweden with real echo bouncing off a distant ice wall; there is no lie.


In that case, we are just transferring artistic control from one human into another. In the past recording audio had fairly little artistic control and the subjects of the recording most of the control. Now with better audio manipulation software the person doing the recording has artistic tools at their disposal. They are the 'photographers' of the scene, while the singer is the subject.


That's right, and the subject might as well be a dog, or any other audio signal source, just like anything that reflects light can be a suitable one for the photographer's creative process. Cute puppies; water lilies; sunsets ...

The thing is, I somehow don't hear the studio's creative input either when I hear the latest auto-tuned Fido or Bowser. They're just applying some automatic something that's supposed to make the dog sound like a more able dog.

This is like when people just batch apply the same color enhancement and sharpening of their Florida vacation pictures. I've seen one instance, I've seen them all.


This is where auto-tune can actually fall short. Singing is not exactly all about hitting the "in equal temperament tune" note all the time.

Take the fantastic singer with great technical skill. Most pitch correction algorithms, as far as I know, are strictly based on equal temperament / 12TET. Fantastic singers are capable of hitting the right harmonics, some of which are not 12TET. Fantastic singers slide into notes, they use vibrato, they add "blue notes" (https://en.wikipedia.org/wiki/Blue_note). If you over-apply pitch correction, in other words, you could easily make a fantastic singer sound worse.

Let's also take the singer who is technically a bit pitch deficient, but has "character" that makes up for it. You don't want to make this type of singer too in-tune, either. Too much tuning might remove the "character".

I understand that in the industry there are some engineers who are good enough to selectively apply auto-tune, to fix only the obvious issues and avoid the pitfalls. There are also some productions that just apply auto-tune to everything with no consideration of the content. The latter will probably work for glossy pop productions, but if I were a really good singer (or a singer with "character") I probably wouldn't like the results.
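To illustrate the 12TET point, here's a naive sketch of my own (not any vendor's actual algorithm; it assumes A4 = 440 Hz and hard snapping with no time smoothing) of what blunt equal-temperament correction does to a deliberately "blue" pitch:

```python
import math

A4 = 440.0  # reference pitch, Hz

def nearest_12tet(freq_hz):
    """Snap a frequency to the nearest 12-tone equal temperament pitch."""
    semitones = 12 * math.log2(freq_hz / A4)
    return A4 * 2 ** (round(semitones) / 12)

# A note sung expressively flat of E5 (~659.26 Hz) gets pulled
# all the way onto the 12TET grid -- the "blueness" is erased.
sung = 650.0
print(nearest_12tet(sung))  # ~659.26 Hz
```

Real tools apply correction gradually over time rather than snapping instantly, which is why the amount of retuning is a production dial rather than on/off.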


That's not true; Melodyne is capable of arbitrary tunings, and even has a feature where you can create custom scales.


Thanks for the information; that's one product I'm not too familiar with. I've demoed the Antares product and a couple of freebies. (It seems like there are a couple of newer plugins out there as well, e.g. Synchro Arts Revoice Pro.)

The problem is, I'm not sure that even a pure alternate tuning can work for all examples. E.g., for blue notes, what is "correct" varies with performers and style.

Now, I'm more talking about the "automatic modes"; I understand Melodyne offers a pretty impressive level of editing control (Antares did too from what I remember). So it would certainly be possible to get a really great take, and then hand-correct any truly off notes to whatever frequency you wanted.

As in many things (see: Photoshop and model photos), a lot of the reaction to the tool is less on how it could be used, and more on how it is being used in glossy "crap singer pop idol" productions.


>Music shouldn't be singing olympics; self-expression and original ideas are worth more than technical skill.

Only it's about neither, but more about hot bodies and rich marketing campaigns.


Depends on what kind of music you're listening to


I'm sure a lot of the music you hear is using autotune without you realizing it. It's not some fad.


I'm sure all of it is.



