Friday, June 5, 2020

Golang Shutdown Flow



While doing some enhancements to the Golang microservices at work I came across quite a few calls to logrus.Fatal late in the execution of the service. Some of these particular services are long-running processes that consume from Kafka and write to GCP Spanner. The problem with logrus.Fatal when called late in these services' lifecycle is that Fatal internally calls os.Exit(1). So let’s examine why this is bad for the system in which the service is running.

On the surface I wonder if a non-recoverable error encountered late in the process is “Fatal”. The interpretation of what “Fatal” means is subjective. At the beginning of the process when reading the configuration file, making external system connections -- this is a Fatal problem because the service cannot even get started. Some precondition failure -- yeah, that’s Fatal. But if a service has been happily consuming messages and writing transformed data to a database only to encounter a non-recoverable error -- is that “Fatal”?

But that’s not what I wanted to show you. What I wanted to show you is why logrus.Fatal(...) is the wrong way to shut down a service that has encountered a fatal error.

First some basics: https://play.golang.org/p/b8CAlmiVZPH

package main

import (
	"errors"
	"fmt"
)

func main() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println("chance to recover:", err)
			panic(err)
		} else {
			fmt.Println("nothing to recover")
		}
	}()

	fmt.Println("Hello, playground")

	panic(errors.New("It's a perfect time to panic! -- Woody"))
}


This is basic Golang: the app panics, the defer will execute, recover() will consume/catch the error, and then we re-panic (just as a naive solution). The panic is reported, and the application returns a non-zero status code.

But what about os.Exit(1)? We know that logrus.Fatal calls os.Exit(1). We need to be aware of the impact upon our defer statements when we use os.Exit(int). As noted in the godoc, os.Exit(int) terminates the program immediately and deferred functions are not run. It's just going to shut down: https://play.golang.org/p/fB8GnFxEATs

package main

import (
	"fmt"
	"os"
)

func main() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println("chance to recover:", err)
		} else {
			fmt.Println("nothing to recover")
		}
	}()

	fmt.Println("Hello, playground")

	os.Exit(0)
}

Here the defer does not run. If you were going to gracefully release the connections to Kafka, Spanner, or any other external resource -- that did not happen. For illustration's sake I also have this example returning the success zero status code.

There is another way to exit a Golang application. Well, sort of: runtime.Goexit(). As noted in the godoc, all deferred calls are run before the goroutine terminates. Because Goexit is not a panic, recover() in those deferred calls returns nil. However, it only exits the calling goroutine. So if this is called in your last goroutine, your service will crash -- in the same fashion as when all goroutines are blocked causing deadlock: https://play.golang.org/p/z0k56ZMoF8N

package main

import (
	"log"
	"runtime"
)

func main() {
	defer func() {
		if err := recover(); err != nil {
			log.Println("chance to recover:", err)
		} else {
			log.Println("nothing to recover")
		}
	}()

	log.Println("Hello, playground")

	runtime.Goexit()
}
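
For contrast, here is a minimal sketch of runtime.Goexit() called from a goroutine other than main. The worker and collect functions and the events channel are hypothetical names made up for this illustration. The worker's deferred calls still run, and because main is still alive the program finishes normally instead of crashing.

```go
package main

import (
	"fmt"
	"runtime"
)

// worker is a hypothetical goroutine body. It exits early via
// runtime.Goexit, yet its deferred calls still run (last registered first).
func worker(events chan<- string) {
	defer close(events)                             // runs second
	defer func() { events <- "worker defer ran" }() // runs first
	runtime.Goexit()
}

// collect starts the worker and gathers everything it reported
// before the channel was closed.
func collect() []string {
	events := make(chan string, 1)
	go worker(events)
	var got []string
	for e := range events {
		got = append(got, e)
	}
	return got
}

func main() {
	// main returns normally here, so the process exits with status zero.
	fmt.Println(collect())
}
```

So Goexit is only "safe" when some other goroutine, usually main, is left to finish the program.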

Where does that leave us? Well, it's important to understand the impact of your libraries upon the flow of execution. The logrus.Fatal(...) method is definitely useful. Use it with the full understanding of what it is doing. Use it during service initialization before any defer functions have been registered. Use it when you know you want defer statements to be skipped.

Bonus:


It is important when fixing these sorts of problems with service shutdown to recognize the significance of your service's exit code. Your service's exit code is its last communication with the software architecture -- its dying breath used to wheeze out one little death rattle.  Are you running your services in Docker? Kubernetes? 

The service exit code is going to communicate to the container if it exited successfully or crashed with an error. In those situations where you were calling logrus.Fatal, you likely do not want to log the error and simply return. That would have your service return zero as exit code communicating success to the software architecture system. Make sure you take into account the pod and the configured restartPolicy. If your service is shutting down because of a non-recoverable error you likely want to wheeze out a death rattle of non-zero.
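
One common way to get both the cleanup and the non-zero exit code is to keep all the real work in a function that returns an error, and to call os.Exit only from main once that function has returned. The following is a minimal sketch under assumptions: the resource type, its Close method, and the run function are hypothetical stand-ins for a real Kafka consumer or Spanner client, not code from the services described above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resource is a hypothetical stand-in for a Kafka consumer or Spanner client.
type resource struct {
	name   string
	closed bool
}

// Close releases the resource and records that it did so.
func (r *resource) Close() {
	r.closed = true
	fmt.Println("released", r.name)
}

// run holds the service logic. Its deferred cleanup completes before
// run returns, so a later os.Exit in main cannot skip it.
func run(r *resource) error {
	defer r.Close()

	// Simulated non-recoverable error encountered late in the process.
	return errors.New("non-recoverable error")
}

func main() {
	r := &resource{name: "kafka consumer"}
	if err := run(r); err != nil {
		fmt.Fprintln(os.Stderr, "fatal:", err)
		os.Exit(1) // no defers are pending in main, so nothing is skipped
	}
}
```

With this shape the connections are released first, and the container still sees the non-zero death rattle.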

Wednesday, February 15, 2017

Password Reset Flow With Native Android

The Problem Scenario

I want a native Android application that will pull the user back after they have used their mail/SMS client to continue the password reset flow:

  1. using the native app, our user requests password reset via email or SMS
  2. the user presses link in their email or SMS client
  3. the user opens the native app to complete password reset
We write native applications for a richer user experience. For all of the valid security reasons the password reset must happen out of band. By pulling the user back into the native app to complete the password reset flow we can guide them to the native experience that we want to provide (and that we have spent money and time creating).


The scenario is defined by the following two Gherkin test cases.

Given the user has submitted a password reset
And the system has emailed the long URL with the website's address
When the user presses on the link in their phone's mail client
Then the native Android app is offered to handle the URL
And when the user chooses the native Android app they are taken directly to the screen to save a new password

Given the user has submitted a password reset
And the system has sent an SMS with a bit.ly shortened URL
When the user presses on the link in their phone's SMS client
Then the native Android app should be offered to handle the URL
And when the user chooses the native Android app they are taken directly to the screen to save a new password

The Android Activity Registration

In Android we can provide one Activity that handles completing the password reset flow. That Activity needs the appropriate intent filters so the operating system knows it can handle the long and the short URLs:

<activity
    android:name=".authentication.CompleteResetPasswordActivity">
    <intent-filter>
        <action android:name="android.intent.action.VIEW"/>
        <category android:name="android.intent.category.DEFAULT"/>
        <category android:name="android.intent.category.BROWSABLE"/>
        <data android:scheme="https" android:host="${host}" android:pathPrefix="/passwordreset"/>
    </intent-filter>
    <intent-filter>
        <action android:name="android.intent.action.VIEW"/>
        <category android:name="android.intent.category.DEFAULT"/>
        <category android:name="android.intent.category.BROWSABLE"/>
        <data android:scheme="https" android:host="m.my.bitly.domain"/>
    </intent-filter>
</activity>

Did you see that "${host}"? That is resolved in build.gradle via manifestPlaceholders:
....
buildTypes {
   debug {
      ...
      manifestPlaceholders = [host:"qa.mydomain"]
   }
   release {
      ...
      manifestPlaceholders = [host:"www.mydomain"]
   }
...

The Code

Android delivers the URL pressed by the user via the getIntent().getData() method, which returns an android.net.Uri instance. Play with it. Massage it. Turn it into whatever you want. For the shortened Uri you will of course have to resolve that thing into the full URI. Perhaps you will be using either the REST or Android Bitly API -- https://dev.bitly.com/.

You will notice that your CompleteResetPasswordActivity is launched as the root of its Task that has affinity to the email or messaging app. Tasks are tricky in Android. These aren't things I've had to deal with much in the past. But tasks and task affinity are things you will need to understand if you want to provide this type of user experience. But that's all I'm going to say about that here. Dealing with the task and getting the user back into the "main task" is worthy of its own post.

Quality Assurance

There are a plethora of scenarios to test. For the email flow, does the user use Gmail, Outlook on the web, some other web client, some other native mail client? For SMS you have each carrier's custom messaging client as well as Hangouts and other options from Google. I have so far tested with:

  • my Project Fi Nexus 6. I am stuck with Hangouts.  It does NOT work. Hangouts launches the link into its own internal browser.
  • a Verizon Motorola Droid.  That phone has Messaging, Messaging+ (Verizon's offering), and Hangouts. The above solution works as expected in all three SMS clients. We can open Gmail and press the full URL link as well.

So.... More testing is needed. Samsung has a wide following for sure. So I'll test on a few flavors of those.

Friday, September 16, 2016

Android Font Settings To Enable Font Variants

Today I learned that fonts often have settings to enable alternate representations of particular characters. For example, Gotham is not a monospaced font.  However, if you enable the "tnum" (tabular numbers) setting for your Android TextView, the digits render at a uniform width, so numbers line up in columns as if the font were monospaced. That is cool!

It appears Android is supporting a W3C standard with this feature. The documentation has a link that references CSS Fonts. Furthermore, this method was added as part of API 21. So unfortunately your users on older API levels will not see the awesome column layout you can produce.

Android TextView Documentation

In code this would look something like this:
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
    grandTotal.setFontFeatureSettings("tnum");
}

or perhaps you are targeting API 21 and can apply the XML setting so that any string put into the field uses the "tnum" or other settings:

<TextView
    android:id="@+id/grand_total"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:fontFeatureSettings="tnum"
    tools:text="$30.35"/>

Or perhaps you are targeting API 19 and your TextView has a style set that you can override in the values-v21 directory:

<TextView
    android:id="@+id/three"
    style="@style/example"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:text="$20.13" />

Then you can define a base style configuration in values/styles.xml

<style name="example">
    <item name="android:textSize">24sp</item>
</style>

and then apply the fontFeatureSettings in values-v21/styles.xml

<style name="example">
    <item name="android:textSize">24sp</item>
    <item name="android:fontFeatureSettings">tnum</item>
</style>

I think the screenshot below with two emulators illustrates the difference well. Without "tnum" you can clearly see that the columns do not line up between the three TextView fields. The emulator on the right has "tnum" applied.



Tuesday, August 30, 2016

Constant Code Review

Pair programming is constant code review.

Is there a column on your Kanban board for "Code Review"? Do you pair program? Why would you need that code review column?  That column is counterintuitive. It is not an agile software practice. I am not proposing you never code review. Actually, the opposite: if you pair, you are most likely already doing constant code review.

We know that bugs are easier to fix the earlier they are found. So we pair program. Fix the bug as soon as it's typed with an attentive pair. You save time by doing it right the first time. Each time you or your pair inadvertently types a bug is a learning/teaching opportunity. Use these moments to talk and internalize how that bug came through and make a mental note to not let it happen again. When you make the mistake is the best time to learn from the mistake. So make sure your pair is being attentive and reviewing the code you type.

How big are your stories? Do they take more than one pairing session? Every time you pair-switch the incoming person should be reviewing the code. The review should cover the design patterns in use as well as looking for typical pitfalls where bugs crop up. Sure, there are probably other things that happen during pair switch. Make sure you review the code that came before! Maybe your stories are not bigger than one pairing session.  It does not matter.  Hold your pair accountable to be an active copilot.

The Agile Manifesto says "Individuals and interactions over processes and tools".  A code review column is putting process over your team members. We already established that you are pair programming. Why would you need to declare that code review only happens after the pair thinks the story is complete? What!? That doesn't make sense. How can the story be finished if it still needs a code review? Invest in your people. Don't let them use a code review column as a safety net that catches problems. The tightrope walker that has no net is much better at their craft than the one using a net.  They have to be or it's a really short career (grin). Take away your safety net to get better at YOUR craft.

Saturday, December 27, 2014

Quick Android Ringtone

My son was making himself at home in his Android phone Christmas present yesterday. He wanted a particular guitar solo as his ringtone.  Here's how I put it together. Spoiler alert, it's much easier than this:

  1. Slice out the guitar solo using iTunes
  2. Convert the MP3 to ogg
  3. Added the ANDROID_LOOP metadata
  4. Copied the ringtone to the phone

Slice out the guitar solo using iTunes

You can configure iTunes to export songs in MP3. You can also tell iTunes to start/stop playing at certain points in a song.
  1. select the song you want to use
  2. CMD-I to open the settings dialog
  3. Go to the Options tab to enter your start/stop time. This will likely take some fiddling to get the slice you want. With these set, the song will only play this section.
  4. Now open the File menu -> Create New Version -> Create MP3 Version
  5. You probably want to go back to the CMD-I properties dialog to clear the start/stop time of this song

Convert the MP3 to ogg

A drag-n-drop later I had an ogg file out of the mp3 by using Media Human. This ogg will work as a ringtone. However there is a long pause before it loops. This is not what we wanted. 

Added the ANDROID_LOOP metadata

I found Audacity to add the loop metadata key/value pair. Drag-n-drop the ogg file into Audacity. Then File menu -> Export Audio. Choose your destination file location and press the Save button. Now you get a new dialog where you can enter the new metadata key/value pair: ANDROID_LOOP:true.

Copied the ringtone to the phone

Android File Transfer works slick. Drag-n-drop the file from a Finder window into the Ringtones directory of Android File Transfer.  You don't have to disconnect the USB cable. Just navigate on your phone to Settings->Sounds and pick your ringtone!

Conclusion

In the end I could have just used Audacity since my music library is already in MP3 format. I did not have to get iTunes to export an AAC into MP3. Audacity will let you select a section of a song by clicking and dragging, and then further adjust the start/stop points. Simply go to the same File menu -> Export Selected Audio.

Sunday, March 23, 2014

Scala Play Framework Template Imports

The template compiler is pretty sweet. Using a template like a function is awesomely simple.  If you have a bit of HTML code that makes up a reused block on more than one page you just factor it out to its own ___.scala.html. Then reference it with the "magical" @ character.

However, the error messages that the template engine provides are less-than-detailed. I had created a subdirectory named: app/views/tags. When I tried to use one of the template functions out of that directory I got the error: not found: value gallery

So let's take a look at the important parts.

I have templates:

  • app/views/main.scala.html
  • app/views/tags/gallery.scala.html
With this as the important parts of main.scala.html

@(artist: models.ArtistModel,
      tags: List[String])(implicit artistModel: Option[models.ArtistModel])
@import tags._
.....
...
@gallery()....
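Hmmmm, "not found: value gallery"? Why can't it find gallery? I imported it. It's formed correctly. Well, this is painfully obvious now, but it took a few minutes for me to reconcile that the "tags: List[String]" is clashing with the "@import tags._". Inside the template the parameter named tags shadows the tags package, so the import pulls in the members of the List[String] value instead of the templates in app/views/tags.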

More on the Play Framework Templates

Saturday, February 23, 2013

Are You Following the Golden Rule?

Well, are you? I'm often asking my sons that exact question. The Golden Rule: Treat others the way you want to be treated. Pretty simple. No room for interpretation. I may have heard it growing up. I don't remember. The Golden Rule was core to the culture at (the company formerly known as) A.G. Edwards & Sons. Ben Edwards often referenced it in his monthly newsletters as he visited branches. Mr. Edwards would inevitably digress to how the food tasted and how that branch was following the golden rule. But now I digress.

It occurred to me the other day that The Golden Rule should not just be about how you treat other people, but also about how you think about other people. To my sons (11 and 8 at this time) this applies to knowing without hesitation that there is no need to say, "Stephen, don't break it!" But as software developers, we should remind ourselves, "This crap code I'm looking at, well, I'm sure the person who wrote it had good intentions and isn't just an idiot." Because remember, the crap code you're looking at just might be your own.