Check out my new blog at https://shibumiware.blogspot.com

Tuesday, May 23, 2017

Google oAuth and Blogger API Deep Dive Part I



Overview

I am going to cover quite a bit in this post regarding using Google's Blogger v3.0 API and achieving specific results for a personal project of mine.  I will cover the internals of Google's oAuth v2.0 implementation in some detail, discuss how to build a browser component with process isolation that is used in the project's WinForms app, plus other commentary.  This is Part I in a series.

But first I am going to discuss the experience of blog authoring with Blogger given the type of writer I am, the tools I prefer, and what started this project I call "ShibumiWare Blog", which is the combination of my blog and a Windows App.

Blogger versus WordPress and Other Platforms

I chose Blogger years and years ago.  I am not going to change because I like their APIs, their domain management facility is fantastic, and I have learned the system well over time.  At times I have almost quit the Blogger platform but frankly I just have too much invested in it.

I like the widgets, the simplicity of the templates I have worked with, and the deep level of customization you can perform to make it look just the way you like it.  In fact, and this is part of the problem and part of the fun, here is a screen shot of what it looks like when I write a post:




Click to pop it out. It is a perfect mess and I love it. Microsoft used to have a standards-based designer suite called Microsoft Expression that I really liked, but it has been discontinued. What you are looking at is Visual Studio 2017 with several extensions installed:




Anyway, it's kind of a hassle because at the end of editing, I cut and paste the contents of an inner portion of the page (the rest of the content essentially mocks up what the blog looks like so I don't do something nuts that won't fit the style or dimensions of the blog template) into Blogger's editor and post.

I love side-projects and I have a ton of them, so I thought I should look at simply uploading the contents directly.  I don't want a full-blown editor because, while complicated for many users, this combination of Visual Studio and extensions is powerful and suits me just fine.

Enter the problem: How to use Google's APIs to post the contents as a new blog post. 

 

Introducing Google Blogger API v3.0

I am not going to regurgitate the API documentation but I am going to highlight portions of the SDK you need to pay close attention to, including the use of code samples for purposes of clarity. 
The documentation will give you detailed information: Blogger API Doc.

The first task is to obtain an API key. Click here to open the credentials page.  The user interface is fairly intuitive so follow the flow until you are given the chance to "Create  credentials":

credentials

You have several options, but for my purposes, I needed an OAuth client ID, which would allow me to ask users if I can perform particular actions on their behalf within the application.  Once you have created it, you can download the associated JSON file with the details:

{
  "installed": {
    "client_id": "[FAKE] 1234456793-yadayadayadayadayadayada.apps.googleusercontent.com",
    "project_id": "[FAKE] projectname",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://accounts.google.com/o/oauth2/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_secret": "[FAKE] tYc41IAIZO3SU75h/UFTSrir",
    "redirect_uris": [ "urn:ietf:wg:oauth:2.0:oob", "http://localhost" ]
  }
}

In the API these are referred to as "Google Secrets" and you should treat them that way, although keep in mind that having the secrets is not enough on its own: the user still has to agree to whatever your app wants to do with them.  See the Crypto Available post, which contains a tool for encrypting and decrypting data that is useful for just this sort of problem.

More on the API key and Google Secrets later. 

The next task is to configure your consent screen, which will be displayed to the user when your app requests access to use the Blogger API (or any other Google API for that matter) on their behalf:


 

After you are done with these two tasks, you have the basics to get started.

 

Blogger API v3.0 & OAuth v2.0

First, let's talk briefly about what I started out trying to accomplish.  I wanted to use my Google Secrets to allow a user (me) access to a blog (mine) and explore the Blogger API while achieving my primary objective, which is posting automatically.  This meant using Google's oAuth implementation as well.  My app is a Windows app.  Keep that in mind.  Here is an early version of the user interface:



App

Authentication

 

I am going to show you two flows, one that uses the standard Google oAuth behavior, and another flow I figured out that makes a lot more sense for a Windows app. 
The second flow appears in Part II of this series.

The first flow starts with a click of the Login button (button in upper left corner):

_Blogger = new Blogger(Resources.GoogleBloggerv3Secrets);
 
if (Settings.Default.UseEmbeddedCodeReceiever)
{
    string browserProcessPath = Path.Combine(Application.StartupPath, ProductConstants.SW_BROWSER);
 
    _Blogger.UseEmbeddedCodeReceiver = true;
    _Blogger.CodeReceiverPath = browserProcessPath;
    _Blogger.CodeReceiverArguments = @"~attach";
}
else
{
    _Blogger.UseEmbeddedCodeReceiver = false;
}
 
_Login = _Blogger.Login(Environment.UserName);

Notice the Blogger object is instantiated with my Google Secrets, which have been decrypted and are ready to go.  Based on settings, I choose not to use the "Embedded Code Receiver", which I will explain later.

NOTE:  When you use Google's oAuth implementation, it uses a FileDataStore object to store the token it receives when the user accepts the application's request for access to the blog.  The samples that I found showed lots of "login" examples that used the user's Gmail account to "login" with, but how would they know that if the user isn't prompted for it?  And secondly, who cares?  The user has to accept the application, which then gives you access to this information anyway.  The most important thing is that there is one token stored for each user on the device.

In the case of Windows, the storage location I chose is [OS-Drive]:\Users\[user name]\AppData\Roaming\ShibumiWare\GoogleBloggerv3.  You could use anything you want really, and the examples are not clear on this.  Even the open source LiveWriter developers appeared confused about how this worked.  I cloned their GitHub repository thinking it would get me where I wanted to go.

It helped but it was weird and unclean.
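To make the storage location concrete, here is a minimal sketch that computes where a token file would land, using only the base class library. It mirrors the FileDataStore logic shown at the end of this post: a root folder, a store folder, and a file named "{type full name}-{key}". The names TokenPathSketch and GetTokenPath are my own hypothetical names, not part of Google's API, and the type name you pass in is an assumption about which type the library serializes.

```csharp
using System;
using System.IO;

public static class TokenPathSketch
{
    // Builds the file name the same way FileDataStore.GenerateStoredKey does:
    // "{full type name}-{key}".
    public static string GenerateStoredKey(string key, string typeFullName)
    {
        return string.Format("{0}-{1}", typeFullName, key);
    }

    // Combines a root folder, the store folder, and the stored key into one path.
    // For the real store the root would be
    // Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData).
    public static string GetTokenPath(string root, string folder, string key, string typeFullName)
    {
        return Path.Combine(root, folder, GenerateStoredKey(key, typeFullName));
    }
}
```

Passing Environment.UserName as the key gives one file per Windows user, which matches the "one token stored for each user on the device" point above.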

Let's follow the regular, non-embedded code receiver flow and see what happens:

First the code goes through a function to see if the token exists and whether it is still valid (it has not expired):

EnsureCredentials(new IdentityToken(identifier, string.Empty, null));

Inside, the local store is examined to see whether a token exists and whether it is valid:

GoogleAuthorizationCodeFlow flow = null;
 
try
{
    if (_SecretStream == null)
    {
        _SecretStream = GetClientSecretsStream();
    }
 
    GoogleAuthorizationCodeFlow.Initializer initializer = new GoogleAuthorizationCodeFlow.Initializer
    {
        ClientSecretsStream = _SecretStream,
        DataStore = GetCredentialsDataStoreForIdentty(),
        Scopes = new List<string>() { BloggerService.Scope.Blogger }
    };
 
    flow = new GoogleAuthorizationCodeFlow(initializer);
 
    Task<TokenResponse> loadTokenTask = flow.LoadTokenAsync(identityToken.Id, CancellationToken.None);
 
    loadTokenTask.Wait();
 
    if (loadTokenTask.IsCompleted)
    {
        // We were able to re-create the user credentials from the cache.
        userCredential = new UserCredential(flow, identityToken.Id, loadTokenTask.Result);
        token = loadTokenTask.Result;
    }
}
finally
{
    flow?.Dispose();
}

The GoogleAuthorizationCodeFlow object holds the various data that is required to perform this "flow", like the Google Secrets, the file data store and the "Scopes", which are API scopes--like Blogger, or Blogger-Read-only, or any scope available for the API you are working with.  Here I just use the BloggerService.Scope.Blogger scope.  These flows do everything async.  The call to flow.LoadTokenAsync is just a wrapper around FileDataStore.GetAsync.

In this situation, I have removed the token for the purposes of demonstrating a complete path through the authorization flow.   What happens next is where it starts to get interesting:


if (!IsValidToken(token))
{
    Task<UserCredential> authorizationTask = GetOAuth2AuthorizationAsync(identityToken.Id, CancellationToken.None, null);
 
    authorizationTask.Wait();
 
    if (authorizationTask.IsCompleted)
    {
        userCredential = authorizationTask.Result;
        token = userCredential?.Token;
    }
}

GetOAuth2AuthorizationAsync is the important call:

public Task<UserCredential> GetOAuth2AuthorizationAsync(string userId,
    CancellationToken taskCancellationToken, BrowserCoreCodeReceiver embeddedCodeReceiver)
{
    _SecretStream = GetClientSecretsStream();
 
    if (UseEmbeddedCodeReceiver)
    {
        if (CodeReceiver == null)
        {
            CodeReceiver = embeddedCodeReceiver;
        }
 
        return GoogleWebAuthorizationBroker.AuthorizeAsync(
            GoogleClientSecrets.Load(_SecretStream).Secrets,
            new List<string>() { BloggerService.Scope.Blogger },
            userId,
            taskCancellationToken,
            GetCredentialsDataStoreForIdentty(), CodeReceiver);
    }
 
    return GoogleWebAuthorizationBroker.AuthorizeAsync(
        GoogleClientSecrets.Load(_SecretStream).Secrets,
        new List<string>() { BloggerService.Scope.Blogger },
        userId,
        taskCancellationToken,
        GetCredentialsDataStoreForIdentty());
}

We do some flow control regarding the use of the embedded code receiver and choose the method that does not include a code receiver.
Here is what I get (with a very cleaned up desktop--usually I have 20+ windows open, which made this worse because an instance of Chrome would pop up at random):

Desktop

Okay, so this isn't that bad, except it got me wondering what was going on in the background because here is what we know about oAuth 2.0's flow:

oAuth 2.0 Flow



My blog tool is the client.  It first looks to the local store for the token representing the authorization grant.  It doesn't find it, so it goes to the authorization server (https://accounts.google.com/o/oauth2/v2/) and asks for authorization, which the server will give depending on whether you select Deny or Allow.  That access token is persisted to the local FileDataStore.  Subsequent calls to the protected resource (in this case, the blog) are granted by having obtained authorization and persisted the access token locally.

So let's accept and see what happens next.  While this browser page is open (or the browser instance is open) but the user hasn't accepted or denied, the authorization flow waits:

Task<UserCredential> authorizationTask = GetOAuth2AuthorizationAsync(identityToken.Id, CancellationToken.None, null);
 
authorizationTask.Wait();
 
if (authorizationTask.IsCompleted)
{
    userCredential = authorizationTask.Result;
    token = userCredential?.Token;
}
 
if (!IsValidToken(token))
{
    throw new ShibumiWareException(@"Invalid token"); // TODO better error message
}
 
identityToken.Token = userCredential;

Meanwhile, over in the browser, notice two things: first, we have a very lame "close the window" message and, more importantly, that message is being served up by localhost:55804/authorize.  I don't have IIS installed on this machine with port 55804 open, so what's the deal?



Now I am really curious, plus, and here is where the new goal comes to mind: I want to know what is going on in the background so I can design the authorization system to NOT jump out of my Windows app.

Time for some serious spelunking, but first a disclaimer:


  1. What I am about to show you is probably not out in the documentation with fine-grained implementation details
  2. The techniques I used to get at the details are not illegal, you just have to know what you are doing
  3. Do not rely on this data, which are implementation details and may change over time
  4. If I am wrong about my interpretation, I take no responsibility for any code incorrectly written because you took advantage of this information.

Deep Dive into Google's oAuth 2.0 Implementation

One thing is certain: the API launches a browser and accesses Google's oAuth service.  The API waits for a response, but it isn't waiting for the browser process to close, because in experimentation it opened a new tab in Chrome (that is still process isolation, but that's harder to wait on, I didn't think they did it that way, and it also led me to another experiment in building my own browser with process isolation, just because I ended up curious about that--I get distracted).

Remember when I said spelunking?  Here is cave number one:

As-Is Implementation

How does the API launch a browser, wait for a response, and then seemingly create a response to the user ("we have the verification code, you may close this window, yadayada") locally on some weird port?

This is what I found:

browserOpenedOk = this.OpenBrowser(url1);
private bool OpenBrowser(string url)
{
  Process.Start(url);
  return true;
}

Yes, that is correct.  It passes the authorization URL to Process.Start, which looks at the protocol or file extension to find the handler for the http protocol and launches the default  browser: in my case, Google Chrome.  The next important call is the following:

authorizationCodeResponseUrl = await this.GetResponseFromListener(listener).ConfigureAwait(false);

The "listener" in this case is an HttpListener:


private HttpListener StartListener()
{
  HttpListener httpListener = new HttpListener();
  httpListener.Prefixes.Add(this.RedirectUri);
  httpListener.Start();
  return httpListener;
}

Notice that one of the prefixes (which can be a full URL or a pattern) is RedirectUri.  Okay, one problem figured out.  What's happening, based on looking at data from Fiddler, is that a local listener is opened on that weird port.  When the authorization URL is sent to the authorization server, it includes this redirect URL.  Once the Allow button is selected, the browser will "open" that address, which is really just a stream that can be written to (like Response.Write).  That is how the "the code has been accepted" message gets written to the browser.
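The loopback round trip described here can be sketched with nothing but the base class library. This is my own minimal illustration, not Google's code: LoopbackReceiverSketch and its methods are hypothetical names, an HttpClient stands in for the browser being redirected, and the "code" query parameter is the standard OAuth 2.0 name for the authorization code. Note that there is a small window between picking the free port and starting the listener in which another process could grab the port.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

public static class LoopbackReceiverSketch
{
    // Asks the OS for a free loopback port by binding to port 0, then releases it.
    public static int GetFreePort()
    {
        TcpListener probe = new TcpListener(IPAddress.Loopback, 0);
        try
        {
            probe.Start();
            return ((IPEndPoint)probe.LocalEndpoint).Port;
        }
        finally
        {
            probe.Stop();
        }
    }

    // Waits for one request, writes a small HTML page back to the caller,
    // and returns the "code" query string value (null if it is missing).
    public static async Task<string> ReceiveCodeAsync(HttpListener listener)
    {
        HttpListenerContext context = await listener.GetContextAsync().ConfigureAwait(false);
        string code = context.Request.QueryString["code"];

        byte[] page = Encoding.UTF8.GetBytes(
            "<html><body>Code received. You may close this window.</body></html>");
        context.Response.ContentType = "text/html";
        context.Response.ContentLength64 = page.Length;
        await context.Response.OutputStream.WriteAsync(page, 0, page.Length).ConfigureAwait(false);
        context.Response.Close();
        return code;
    }

    // Simulates the whole round trip; HttpClient plays the redirected browser.
    public static async Task<string> DemoAsync(string fakeCode)
    {
        string redirectUri = string.Format("http://localhost:{0}/authorize/", GetFreePort());
        using (HttpListener listener = new HttpListener())
        using (HttpClient browser = new HttpClient())
        {
            listener.Prefixes.Add(redirectUri);
            listener.Start();

            Task<string> receive = ReceiveCodeAsync(listener);
            await browser.GetStringAsync(
                redirectUri + "?code=" + Uri.EscapeDataString(fakeCode)).ConfigureAwait(false);
            string code = await receive.ConfigureAwait(false);

            listener.Stop();
            return code;
        }
    }
}
```

The point of the sketch is that no web server is installed anywhere: the process that asked for authorization is also the process that answers the browser.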

The this.RedirectUri is actually a lazily computed property: the getter builds the value the first time it is called and caches it in a field:


public string RedirectUri
{
  get
  {
    if (!string.IsNullOrEmpty(this.redirectUri))
      return this.redirectUri;
    return this.redirectUri = string.Format("http://localhost:{0}/authorize/", (object) LocalServerCodeReceiver.GetRandomUnusedPort());
  }
}

One clue here I took note of is "LocalServerCodeReceiver.GetRandomUnusedPort()".  Hmm, local code receiver.  Does that mean there are other types of code receivers?

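The property does more than that.  The call above briefly starts a TcpListener on port 0 so the operating system assigns an unused port, reads that port number, and stops the listener.  The port becomes part of the redirect URL, which is attached to the query string of the authorization URL sent to the authorization server: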

private static int GetRandomUnusedPort()
{
  TcpListener tcpListener = new TcpListener(IPAddress.Loopback, 0);
  try
  {
    tcpListener.Start();
    return ((IPEndPoint) tcpListener.LocalEndpoint).Port;
  }
  finally
  {
    tcpListener.Stop();
  }
}

I will cut to the chase so we can move on.  What happens next, without showing you even more code, is the request is fired off to Google's authorization server, which includes all of the information required to figure out what consent screen and API key to use (remember the Google Secrets?).  The user is redirected to the consent screen where they hopefully accept.  Google processes the acceptance, generates an authorization code, sticks it onto the query string and redirects the browser to the local address, where the query string is picked apart.  The library then exchanges that code at the token endpoint for the token (it's a JSON string), which is saved to the local store.  Boom!
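As a rough picture of the request that gets fired off, here is a sketch that assembles an authorization URL from the standard OAuth 2.0 query parameters (response_type, client_id, redirect_uri, scope). It is an illustration under assumptions: AuthUrlSketch is a hypothetical name, and the exact URL Google's library builds may carry additional parameters that I am not showing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AuthUrlSketch
{
    // Builds an authorization request URL from standard OAuth 2.0 parameters.
    // response_type=code asks the server to send back an authorization code.
    public static string Build(string authUri, string clientId, string redirectUri, string scope)
    {
        var parameters = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("response_type", "code"),
            new KeyValuePair<string, string>("client_id", clientId),
            new KeyValuePair<string, string>("redirect_uri", redirectUri),
            new KeyValuePair<string, string>("scope", scope)
        };

        // Every name and value is escaped so URLs can travel inside the query string.
        string query = string.Join("&", parameters.Select(
            p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));

        return authUri + "?" + query;
    }
}
```

Called with the auth_uri from the secrets file, the client ID, the localhost redirect address and the Blogger scope, this yields a URL the default browser can open. The redirect address has to be escaped because it is itself a URL carried inside another URL.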

You have authorized my app to access your blog!

As you can see in that fancy FileDataStore there is a token from Google.


Open it up and this is what you see:

{
  "access_token": "[FAKE] ya29.GlsoBPScQfThMgV7Vf8d9TbwKwAU2VAkcLlCbsEFu4pNNZtKCzrYGJahTvl",
  "token_type": "Bearer",
  "expires_in": 3600,
  "refresh_token": "[FAKE]GMLEOnd3suxfsdfser323425AhoYwK4SX5PBhZcJLgD-E",
  "Issued": "2017-04-09T14:01:55.016-05:00",
  "IssuedUtc": "2017-04-09T19:01:55.016Z"
}
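The fields in this file are enough to tell whether the access token has expired: add expires_in seconds to the issue time and compare with the clock. Here is a minimal sketch of that arithmetic. TokenExpirySketch is a hypothetical helper of my own, not the IsValidToken used earlier and not the library's own expiry logic, and the safety margin is an assumption I added so a token is not used in its last few seconds.

```csharp
using System;

public static class TokenExpirySketch
{
    // Returns true when the token's lifetime (issue time plus expires_in seconds),
    // shortened by a safety margin, has already passed at the given moment.
    public static bool IsExpired(DateTime issuedUtc, long expiresInSeconds,
                                 DateTime nowUtc, TimeSpan safetyMargin)
    {
        DateTime expiresAtUtc = issuedUtc.AddSeconds(expiresInSeconds);
        return nowUtc >= expiresAtUtc - safetyMargin;
    }
}
```

Because the file also holds a refresh_token, an expired access token does not have to mean sending the user back to the consent screen; the refresh token exists so a new access token can be requested.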

What's next?   The first thing I wanted to see if I could accomplish is to keep the user inside the blog application without popping out into a browser window to give my app permission to work with the blog.

Remember I said to take note of the LocalServerCodeReceiver?  As it turns out, the GoogleWebAuthorizationBroker has an overload of AuthorizeAsync that takes an ICodeReceiver object, which at the time I surmised would replace the current OpenBrowser technique so I could keep the user in the app for authorization:

public static async Task<UserCredential> AuthorizeAsync(ClientSecrets clientSecrets,
    IEnumerable<string> scopes,
    string user,
    CancellationToken taskCancellationToken,
    IDataStore dataStore = null,
    ICodeReceiver codeReceiver = null)

This looks promising, so I go off into cave number two.

Replacing the Local Code Receiver with My Own

This one took me a little while to track down, but when I found it, it turned out to be an amazingly simple interface:

public interface ICodeReceiver
{
    string RedirectUri { get; }
 
    Task<AuthorizationCodeResponseUrl> ReceiveCodeAsync(AuthorizationCodeRequestUrl url, CancellationToken taskCancellationToken);
}

Because I NuGet'ed the Google APIs into my solution, I created a new class that implements ICodeReceiver, with a couple of additions.  My general desire was to implement the code receiver using .NET's WebBrowser class, but two things made doing that directly impractical and ugly.  The impractical part was that the threading model, the async call sequences, and running on the UI thread made it a mess to synchronize.  The ugly part is that I am implementing a class library for extending and wrapping the Google APIs and I hate to include System.Windows.Forms; it's just ugly.

This Concludes Part I



Again, if you are seeing this as a post on Facebook, don't respond there.  I rarely pay attention.  If you have comments you can email me directly at shibumi-at-shibumiware-dot-com or leave a comment on the blog.


For completeness, here is basically the implementation of FileDataStore, just so you can see there is nothing special going on here:

using System;
using System.IO;
using System.Threading.Tasks;
using Google.Apis.Json; // NewtonsoftJsonSerializer
 
namespace Google.Apis.Util.Store
{
    public class FileDataStore
    {
        private readonly string folderPath;
 
        public FileDataStore(string folder, bool fullPath = false)
        {
            folderPath = fullPath ? folder : Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData), folder);
            if (Directory.Exists(folderPath))
                return;
 
            Directory.CreateDirectory(folderPath);
        }
 
        public string FolderPath
        {
            get
            {
                return folderPath;
            }
        }
 
        public static string GenerateStoredKey(string key, Type t)
        {
            return string.Format("{0}-{1}", t.FullName, key);
        }
 
        public Task ClearAsync()
        {
            if (Directory.Exists(folderPath))
            {
                Directory.Delete(folderPath, true);
                Directory.CreateDirectory(folderPath);
            }
            return Task.Delay(0);
        }
 
        public Task DeleteAsync<T>(string key)
        {
            if (string.IsNullOrEmpty(key))
                throw new ArgumentException("Key MUST have a value");
 
            string path = Path.Combine(folderPath, GenerateStoredKey(key, typeof(T)));
            if (File.Exists(path))
                File.Delete(path);
            return Task.Delay(0);
        }
 
        public Task<T> GetAsync<T>(string key)
        {
            if (string.IsNullOrEmpty(key))
                throw new ArgumentException("Key MUST have a value");
 
            TaskCompletionSource<T> completionSource = new TaskCompletionSource<T>();
            string path = Path.Combine(folderPath, GenerateStoredKey(key, typeof(T)));
            if (File.Exists(path))
            {
                try
                {
                    string input = File.ReadAllText(path);
                    completionSource.SetResult(NewtonsoftJsonSerializer.Instance.Deserialize<T>(input));
                }
                catch (Exception ex)
                {
                    completionSource.SetException(ex);
                }
            }
            else
                completionSource.SetResult(default(T));
            return completionSource.Task;
        }
 
        public Task StoreAsync<T>(string key, T value)
        {
            if (string.IsNullOrEmpty(key))
                throw new ArgumentException("Key MUST have a value");
 
            string contents = NewtonsoftJsonSerializer.Instance.Serialize((object)value);
            File.WriteAllText(Path.Combine(folderPath, GenerateStoredKey(key, typeof(T))), contents);
            return Task.Delay(0);
        }
    }
}

 

Shibumi


Disclaimer

Content on this site is provided "AS IS" with no warranties and confers no rights. Additionally, all content on this site is my own personal opinion and does not represent my employer's view in any way.