
ML / Using Turi Create for Machine Learning Style Transfer



A few days ago, I found out about "Turi Create", a Python module from Apple that lets even non-Apple computer users generate an MLM (Machine Learning Model) that can be integrated into iOS apps on iPhone and iPad.

The cool thing about Turi Create is that you can also use it for your own purposes, processing your own data with Machine Learning.
https://developer.apple.com/videos/play/wwdc2018/712/

As an additional note, at WWDC 2018 Apple also presented an Xcode Playground tool for macOS that lets users easily generate ML models for all kinds of purposes. For an Image Classifier, the process is as simple as drag and drop. It is quite fascinating, and the video is worth watching.

STYLE TRANSFER

From what I gathered after some brief research, there are a few applications of Machine Learning that we can use in apps, such as Image Classification, Graphing, etc., but the one that interests me in particular is STYLE TRANSFER using Turi Create:
https://apple.github.io/turicreate/docs/userguide/style_transfer/

You might have heard of or used a famous "style transfer" app called PIKAZO. That app produces something like Deep Dream Generator output, or a more advanced Style Transfer that gives a very abstract mosaic look and heads into the unknown.

I am a self-taught programmer, with basic knowledge of coding in Python and Swift. For this, we will be using Python.

TURI CREATE is really made for ease of use. Apple provides lots of handy procedures to generate ML models, and for Style Transfer you can visualize TABULAR DATA and quickly generate STYLIZED images.

I did a few tests, and the process of generating an ML model for style transfer is easy enough. It takes a bit of time, but I will explain the whole process.

At this point, from what I observe, the resulting STYLE TRANSFER is more or less like a FILTER. An example can be seen in Apple's Clips app, which does this style transfer in realtime. A slightly higher quality filter can be seen in the iOS 12 FaceTime and iMessage realtime video filters.

STEP BY STEP PROCESS OF GENERATING & TRANSFERRING STYLE

STEP 1.
COLLECT and SAMPLE your machine learning data images for CONTENTS AND STYLES


Interestingly enough, we need quite a lot of images for content (the recommendation is 300-500 images), as well as a bunch of images for style. So far I am using 11 images for style.
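Before training, it can help to confirm how many images each folder actually holds. The following is a minimal sketch using only the Python standard library. The helper name count_images and the list of extensions are assumptions made for illustration and are not part of Turi Create; the folder names content/ and style/ match the ones used in STEP 2. It only looks directly inside the folder, not in subfolders.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}

def count_images(folder):
    """Count image files directly inside a folder (hypothetical helper)."""
    path = Path(folder)
    if not path.is_dir():
        return 0
    return sum(1 for f in path.iterdir()
               if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS)

if __name__ == '__main__':
    print('content images:', count_images('content/'))
    print('style images:', count_images('style/'))
```

If the content count comes out well below the recommended 300-500, that is probably worth fixing before spending hours on training.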

For my first test, I am using artworks by Yoji Shinkawa (Metal Gear Solid) as the STYLE to learn. I thought: imagine if we could get beautiful images that are more or less based on this style!

In reality, it is not that simple. I am not 100% sure that Machine Learning and Style Transfer can reproduce "ink style strokes" and the colors "exactly", but maybe in the near future.

For now anyhow, I am just curious how things will turn out.


I picked about 11 images at random (I could have provided more data to work with. That is for my future testing, or you can try it yourself).

Here, I am training the model using Photos and Illustrations. There are a lot of variables you can try. Maybe a certain Painterly look or certain Patterns and Textures can be recreated.

There are examples out there that produce beautiful and distinct "style transfer" results. The Van Gogh one in particular is quite interesting and unique, with its swirly strokes.

I personally would like results resembling manga, anime, or classical ukiyo-e brush painting.

STEP 2.
Let TURI CREATE generate Machine Learning Model


So the next step is to generate the ML model. I am using Jupyter Notebook to keep everything tidy, but basically the code is as below.

import turicreate as tc
style = tc.load_images('style/') # get images from folder named style
content = tc.load_images('content/')

You can also resize the images in an SFrame manually and save the result, for example:
style = tc.load_images('style/')
style['image'] = style['image'].apply(lambda img: tc.image_analysis.resize(img, 256, 256, 3))
style.save('resized.sframe')

At this point, GPU is not yet supported for "Style Transfer" (it might already be supported for other machine learning purposes), so let's do "slow" CPU processing for now. In the future you can specify how many GPUs to use:
tc.config.set_num_gpus(0)


Run the Style Transfer training with a specific number of ITERATIONS:
model = tc.style_transfer.create(style, content, max_iterations = 1000)

Or simply run it with the automatic number of iterations (I am getting 5000 here, and it runs for days):
params={'print_loss_breakdown': True}
model = tc.style_transfer.create(style, content, _advanced_parameters=params)

You will get a nice log printed out during the process.

This will take some time with CPU processing. Ideally it should run on a GPU and be a lot faster.

100 iterations will give you a basic, standard filtered look. I remember OpenToonz Chainer giving a result similar to this.




1000 iterations on my 2017 MacBook Pro took around 7 hours. The result is already pretty good.



5000 iterations took days, around 50 hours maybe, but gave a cleaner result. Perhaps there are other variables that I have not touched yet; I will study them further. I don't think I need 5000 iterations if my data sample is small.
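The timings above can be turned into a rough per-iteration cost. This is only back-of-the-envelope arithmetic on the two figures quoted (about 7 hours for 1000 iterations, about 50 hours for 5000); seconds_per_iteration is a hypothetical helper name, and the numbers will differ on other machines and data sets.

```python
def seconds_per_iteration(hours, iterations):
    """Average seconds spent per training iteration."""
    return hours * 3600 / iterations

# Figures quoted in this post (approximate, CPU only).
run_1000 = seconds_per_iteration(7, 1000)
run_5000 = seconds_per_iteration(50, 5000)

print('1000 iterations: {:.1f} s per iteration'.format(run_1000))
print('5000 iterations: {:.1f} s per iteration'.format(run_5000))
```

Both runs work out to a few tens of seconds per iteration (about 25 and 36 seconds), which gives a way to guess how long a given max_iterations value might take on similar hardware.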

The explore() method gives a gallery view.



Supposedly, there are ways to make the style stronger on the applied image; however, I have not tried that yet. I do want a more abstract result.




STEP 3.
Saving the STYLE TRANSFER ML Model


This part is quite magical, actually: saving and exporting the model so that you can reuse it to apply the style. One way to save the model:

model.save('style_transfer.model')              # save in Turi Create's own format
model.export_coreml('MyStyleTransfer.mlmodel')  # export to Core ML for use in Xcode
model = tc.load_model('style_transfer.model')   # load the saved model again later

At this point, you can share the model for others to use. Perhaps even drag and drop the ML model into an iOS app in Xcode.

STEP 4.
Applying the Style Transfer Model to Images

Now, to use the model and apply it to image(s). This takes just a few seconds. The image is scaled down automatically to around 800 pixels.

Showing Images:
sample_image = content['image'][1]
sample_image.show()

Apply Style to Images:
stylized_image = model.stylize(sample_image, style = 1)
stylized_image.show()

To save the stylized image we need to use PIL for now, but the module might be updated in the future:
from PIL import Image
Image.fromarray(stylized_image.pixel_data).save('blah.png')

EXAMPLE CODE TO PROCESS A BATCH OF IMAGES USING 1 STYLE: (great for animation, I tested using iPhone Live Photos)

# 'batch' is an SFrame loaded earlier with tc.load_images()
for i in range(0, 58):  # change 58 to the number of images in your batch
    sample_image = batch['image'][i]
    stylized_image = model.stylize(sample_image, style = 2)
    #stylized_image.show()
    output_name = 'bounce_{:03d}.png'.format(i)
    Image.fromarray(stylized_image.pixel_data).save(output_name)
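
The loop above hard-codes the number of frames. A slightly more general sketch follows, assuming the frames are any sequence of images and that the model exposes stylize(image, style=...) as used above. The names stylize_batch, save_png and the prefix parameter are hypothetical, introduced only for illustration.

```python
def save_png(stylized_image, output_name):
    """Save a stylized image to disk via PIL, as done earlier in this post."""
    from PIL import Image  # imported lazily so stylize_batch works without PIL
    Image.fromarray(stylized_image.pixel_data).save(output_name)

def stylize_batch(model, images, style, prefix='bounce', save=save_png):
    """Stylize every image with one style and save numbered PNG files.

    Returns the list of file names that were passed to the save function.
    """
    written = []
    for i, image in enumerate(images):
        stylized_image = model.stylize(image, style=style)
        output_name = '{}_{:03d}.png'.format(prefix, i)
        save(stylized_image, output_name)
        written.append(output_name)
    return written
```

With the variables from this post, a call would look like stylize_batch(model, batch['image'], style=2). Passing the save function as a parameter is just a design choice that makes it easy to swap in another way of writing files later.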


EXAMPLE CODE TO PROCESS ONE IMAGE WITH ALL STYLES:

# use ALL styles on one image ('blah' is an SFrame loaded earlier with tc.load_images())
sample_image = blah['image'][0]  # the first image found in the folder

for i in range(11):  # one pass per style image (11 in my case)
    stylized_image = model.stylize(sample_image, style = i)
    #stylized_image.show()
    output_name = 'blah_{:03d}.png'.format(i)
    Image.fromarray(stylized_image.pixel_data).save(output_name)

You can force a maximum output size for the image, but if it is too big it will crash:
stylized_image = model.stylize(sample_image, style = 8, max_size=1200)
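
To get a feel for what a size cap means for a given photo, here is a small sketch. It assumes the cap applies to the longest side and that the aspect ratio is preserved; that is an assumption about the behaviour, not something confirmed from the Turi Create source, and capped_size is a hypothetical helper. The 4032 x 3024 input is just an example photo size.

```python
def capped_size(width, height, max_size=800):
    """Return (width, height) scaled so the longest side is at most max_size.

    Assumes aspect-ratio-preserving downscaling; images that are already
    small enough are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_size:
        return width, height
    scale = max_size / longest
    return round(width * scale), round(height * scale)

print(capped_size(4032, 3024))        # with the default cap of 800
print(capped_size(4032, 3024, 1200))  # with the larger cap used in the call above
```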

Use the built-in EXPLORE() function to produce a quick preview gallery of the SFrame images:

results = model.stylize(blah)
results.explore()
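
If you want files on disk rather than a gallery, the results can also be walked row by row. The sketch below assumes each row of results behaves like a dict with 'row_id', 'style' and 'stylized_image' keys; treat those column names as an assumption and check them against your own output first. The helpers result_file_name and save_results are hypothetical names.

```python
def result_file_name(row, prefix='blah'):
    """Build a file name from one row of the stylize() results.

    Assumes the row is dict-like with integer 'row_id' and 'style' values.
    """
    return '{}_{:03d}_style{:02d}.png'.format(prefix, row['row_id'], row['style'])

def save_results(results, save, prefix='blah'):
    """Call save(image, file_name) for every row and return the file names."""
    names = []
    for row in results:
        name = result_file_name(row, prefix)
        save(row['stylized_image'], name)
        names.append(name)
    return names
```

The save argument can be any function that takes an image and a file name, for example a small wrapper around the PIL line used earlier: lambda img, name: Image.fromarray(img.pixel_data).save(name).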

If you want to use a DIFFERENT STYLE, you need to provide new style image data and repeat the whole process above.

I am really curious whether this Turi Create Style Transfer can actually produce a much more abstract result. Maybe it can.

SO MUCH EASIER TO STYLE TRANSFER...

In the past, I have tried all kinds of ways to do this "style transfer": installing things and getting them to work with the GPU, in all kinds of different languages, including the use of containers and virtualization... so much headache!

I think TURI CREATE indeed makes the process so much easier for Artists who want to tinker with machine learning.

I am excited about how all of this can be further explored and improved in the future. Let me know what you think.

