tag:willchen.me,2013:/posts Will Chen 2016-07-05T05:20:16Z Will Chen tag:willchen.me,2013:Post/1069224 2016-07-03T02:25:07Z 2016-07-03T02:25:07Z Event Log Workflow

Why event logs are used by data systems and accountants

Mature data systems use an event log mechanism, which means they track every change.

Let’s consider the canonical example of a bank with a database of its customers’ financial records. A simple implementation of this database would have a row for each customer account and columns for the account id and the account balance. Whenever a transaction occurs, the account balance is adjusted. Each transaction’s details, once accounted for, are promptly discarded to save space.

This implementation is problematic for several reasons:

  • What if the customer wants to verify the accuracy of the account balance? The simplest way to do this would be to provide a record of each transaction since the inception of the account.

  • What if a credit card was stolen three days ago and the bank wants to undo the transactions made since then? It can’t, since none of the transaction information was stored permanently.

  • How do you handle concurrent updates to the same account? If two threads both see $100 as the existing balance and they each try to update the balance to a new number, will you end up losing one of the updates?

An event log is a simple but powerful concept: you store each event permanently and later generate views based on those events. This is much like how an accountant never “erases” a record when a debt is paid off; the accountant only adds a new entry to the general ledger.

The only two downsides are the storage cost of keeping each event and the computation cost of generating each view. In real-world systems, you may end up compacting old events into a snapshot. Likewise, an accountant might calculate the current balance by starting from last year’s audited account balance and then tallying all the transactions in the current year. In theory, one could generate it by tallying all the transactions since the inception of the audited firm, but that would be excessive work with negligible benefit.
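To make the bank example concrete, here is a minimal sketch of an append-only log with a derived balance and a snapshot. The names (`TransactionEvent`, `AccountLog`, `reverseSince`) and the choice of storing amounts in cents are assumptions for illustration, not taken from any particular system.

```typescript
// Hypothetical event shape: amounts in cents; positive = deposit, negative = withdrawal.
interface TransactionEvent {
  readonly accountId: string;
  readonly amountCents: number;
  readonly timestamp: number; // epoch milliseconds
}

// A snapshot summarizes all events up to and including `asOf`.
interface Snapshot {
  readonly accountId: string;
  readonly balanceCents: number;
  readonly asOf: number;
}

class AccountLog {
  private readonly events: TransactionEvent[] = [];

  // Events are only ever appended; nothing is updated or deleted.
  append(event: TransactionEvent): void {
    this.events.push(event);
  }

  // The balance is a view: replay events, optionally starting from a snapshot.
  balance(accountId: string, snapshot?: Snapshot): number {
    const start = snapshot ? snapshot.balanceCents : 0;
    const after = snapshot ? snapshot.asOf : -Infinity;
    return this.events
      .filter((e) => e.accountId === accountId && e.timestamp > after)
      .reduce((sum, e) => sum + e.amountCents, start);
  }

  // Compact everything up to `asOf` into a snapshot (the "audited balance").
  snapshot(accountId: string, asOf: number): Snapshot {
    const balanceCents = this.events
      .filter((e) => e.accountId === accountId && e.timestamp <= asOf)
      .reduce((sum, e) => sum + e.amountCents, 0);
    return { accountId, balanceCents, asOf };
  }

  // Undo is itself an event: append a reversing entry instead of erasing history.
  reverseSince(accountId: string, since: number, now: number): void {
    const recent = this.events.filter(
      (e) => e.accountId === accountId && e.timestamp >= since && e.timestamp < now
    );
    for (const e of recent) {
      this.append({ accountId, amountCents: -e.amountCents, timestamp: now });
    }
  }
}
```

Replaying from a snapshot gives the same balance as replaying from the beginning, just with less work, which is the accountant's trick of starting from last year's audited balance.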

Using an event log to “Get Things Done” (GTD)

We can apply the same event log methodology to keep track of everyday activities, to maximize productivity and ensure correctness.

The actual category names aren't important but I'll tell you mine to make this description more concrete.

The first group is called in progress and these are the tasks that I’m working on right now. Typically I'm only going to have one task that I’m working on at any given point. Sometimes, I might be waiting for someone else before I finish a task (e.g. getting a code review) and in that case it's okay to have two tasks in progress. The goal is to limit the cognitive load by focusing your attention on one thing and doing one thing well: you might recognize this as the Unix philosophy.

The second group is called upcoming - this is the work I haven't done yet. Each of these tasks should be discretely defined with a clear definition of what “done” means. And if I can't precisely define the task yet, that’s the first sub-task that I will do for that task. The goal is to minimize the number of large hairy tasks that I might feel too much inertia to start and instead have many small tasks that I can accomplish in a day or so.

The last group is called completed and these are all the tasks that I’ve finished previously.  

What's important to note here is that I never delete a task. The lifecycle of a task is that it starts in the upcoming group and then it goes to the in progress group and then finally it lands in the completed group.

If one of the tasks that I'm working on isn't needed anymore, I won’t actually delete it. Instead I’ll move it to the completed group and mark it with the keyword “skip”. If I find out a month later that I actually do need to do the task, I haven’t lost any information.
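The same lifecycle can be sketched as a tiny event log. This is only an illustration of the idea (I actually use a Google doc); the type names and the example tasks are made up.

```typescript
type Group = "upcoming" | "in progress" | "completed";

// Every change to a task is recorded as an event; tasks are never deleted.
interface TaskEvent {
  readonly task: string;
  readonly to: Group;
  readonly note?: "skip";
}

// Derive the current board (task -> latest event) by replaying the log in order.
function currentBoard(events: TaskEvent[]): Record<string, TaskEvent> {
  const board: Record<string, TaskEvent> = {};
  events.forEach((e) => {
    board[e.task] = e; // later events override earlier ones
  });
  return board;
}

const log: TaskEvent[] = [
  { task: "Write design doc", to: "upcoming" },
  { task: "Fix login bug", to: "upcoming" },
  { task: "Write design doc", to: "in progress" },
  // No longer needed: moved to completed and marked "skip" rather than deleted.
  { task: "Fix login bug", to: "completed", note: "skip" },
];

const board = currentBoard(log);
```

Because the skipped task is still in the log, picking it up again a month later is just one more event that moves it back to upcoming.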

Lastly, I like doing this in a Google doc. It keeps things really simple because there's no fancy UI to distract me. Furthermore, everything is tracked in the revision history.


tag:willchen.me,2013:Post/1057369 2016-05-30T05:21:42Z 2016-05-30T05:21:42Z Front-end Architecture

Outline:

  • Import "concepts" not "implementations" - encourage a pattern where people import a generic Component interface
  • Encourage compositions through decorator and mix-in patterns
  • Type safety as a first-class concern
  • Prefer fractal architecture (e.g. a big component is composed of smaller components)
  • Web standards over proprietary standards (e.g. use the normal DOM interface, make it compatible with web components)
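A rough sketch of how the first, second, and fourth points could fit together. The `Component` interface and the helper names are hypothetical, and rendering to strings (rather than real DOM nodes) is a simplification so the example runs outside a browser.

```typescript
// Hypothetical generic interface: callers depend on this "concept",
// not on any concrete implementation.
interface Component {
  render(): string;
}

// A leaf component.
function text(content: string): Component {
  return { render: () => content };
}

// Fractal composition: a container is itself a Component made of smaller Components.
function container(tag: string, children: Component[]): Component {
  return {
    render: () =>
      "<" + tag + ">" + children.map((c) => c.render()).join("") + "</" + tag + ">",
  };
}

// Decorator: wraps any Component and adds behavior without knowing how it is implemented.
function withClass(className: string, inner: Component): Component {
  return {
    render: () => '<div class="' + className + '">' + inner.render() + "</div>",
  };
}

const page: Component = container("main", [
  withClass("title", text("Hello")),
  container("ul", [container("li", [text("one")]), container("li", [text("two")])]),
]);
```

Since everything is typed as `Component`, the compiler checks each composition, and a big component is just a small one at a different scale.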
tag:willchen.me,2013:Post/940886 2015-11-30T03:40:53Z 2015-11-30T03:40:53Z Review of "Large-scale Automated Visual Testing"

Just watched a video from Google's 2015 testing conference called Large-scale Automated Visual Testing. Incredibly insightful talk by a cofounder of Applitools, a SaaS provider for visual diff testing.

I had heard of Applitools before when I was researching various visual diff tools for my team at work, and I was initially wary that the talk would be an extended infomercial for Applitools' product. My concern was quickly proven wrong. It's an incredibly informative talk filled with numerous examples and demos of his tips for doing visual testing in an efficient and effective manner. I was actually blown away by the demos of Applitools and how effective they were at identifying "structural changes", that is substantive changes to a website / app, while ignoring minor differences between browsers or dynamic content that changes (e.g. article blurbs that change each day).

I'm looking forward to trying out the free plan and seeing if we can incorporate Applitools into our team's continuous delivery workflow.

tag:willchen.me,2013:Post/940850 2015-11-30T01:01:21Z 2015-11-30T01:01:22Z Data normalization

Data normalization is one of those terms that I've been intimidated by for a while. My initial reaction is that it's about making the data "normal", i.e. standardized, so you can't have some rows where the date is a timestamp (e.g. 14141231) and others where it's a string (e.g. "January 23, 2015"). I think that initial intuition was on the right track but data normalization seems to be more focused on making sure no particular piece of data is stored in more than one place. Essentially, if I can boil it down, data normalization is about having a "single source of truth" for any given piece of information (e.g. Bill Clinton's date of birth).

The first three normal forms build on each other, with the second being more strict than the first, and so on. The examples in the Wikipedia pages were actually very easy to understand and I highly recommend skimming through the pages and reading through the examples:

https://en.wikipedia.org/wiki/Database_normalization

https://en.wikipedia.org/wiki/First_normal_form

https://en.wikipedia.org/wiki/Second_normal_form

https://en.wikipedia.org/wiki/Third_normal_form

I initially got interested in what "normalization" meant when Dan Abramov mentioned his library normalizr, which normalizes nested JSON data.
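To show what normalizing nested JSON means, here is a hand-written sketch. It deliberately does not use normalizr's API; the `Author` / `NestedPost` shapes and the `normalizePosts` function are invented for the example.

```typescript
// Hypothetical nested API response: the same author object is embedded in every post.
interface Author {
  id: string;
  name: string;
}
interface NestedPost {
  id: string;
  title: string;
  author: Author;
}

// Normalized shape: each entity is stored once, keyed by id; posts refer to authors by id.
interface NormalizedPost {
  id: string;
  title: string;
  authorId: string;
}
interface Normalized {
  authors: Record<string, Author>;
  posts: Record<string, NormalizedPost>;
}

function normalizePosts(nested: NestedPost[]): Normalized {
  const result: Normalized = { authors: {}, posts: {} };
  nested.forEach((post) => {
    result.authors[post.author.id] = post.author; // single source of truth per author
    result.posts[post.id] = { id: post.id, title: post.title, authorId: post.author.id };
  });
  return result;
}

const nestedPosts: NestedPost[] = [
  { id: "p1", title: "Event logs", author: { id: "u1", name: "Will" } },
  { id: "p2", title: "Normalization", author: { id: "u1", name: "Will" } },
];

const normalized = normalizePosts(nestedPosts);
```

After normalizing, renaming the author means changing `normalized.authors["u1"]` once, instead of hunting down every post that embeds a copy.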

As a business analyst in my last job, I think this notion of de-duplicating data is second nature and storing the same piece of information in multiple places is the bane of any data analyst managing a complex Excel workbook. For example, sometimes we had to build an Excel model really quickly and take some shortcuts. Later when our boss would ask us "what would be the impact if factors A and B were adjusted by 5%?", it wouldn't be as simple as changing a single cell in one place. The difficulty would be in remembering all the places where you would need to manually update the data. Of course, as you get better at Excel modeling, you would utilize cell references as much as possible, and try to consolidate all the various inputs ("levers" in consulting-speak) in one area, ideally the first worksheet of an Excel file.

tag:willchen.me,2013:Post/940847 2015-11-30T00:49:56Z 2015-11-30T00:50:24Z Reading lists

I just purchased several ebooks from Prag Prog because they have a Cyber Monday discount, and I've decided to have a moratorium on purchasing new ebooks until I finish my "to read" list.

Recently finished reading:

  • Leading the transformation: Applying agile and devops principles at scale
  • The Halo effect
  • Debugging teams
  • Hooked: How to build habit-forming products
  • Learning Agile

Currently reading:

  • Innovator's Dilemma
  • Lean Product Playbook

On break from reading:

  • Crossing the Chasm 
  • Service design

Currently on my reading list:

  • NoSQL Distilled (read ~half)
  • Designing data-intensive applications (still being written)
  • How to solve it
  • Reactive Programming JS
  • Predicting the unpredictable
  • Release it!
  • Beyond legacy code
  • The Go programming language
  • How Linux works
  • Functional programming through lambda calculus
tag:willchen.me,2013:Post/940780 2015-11-29T22:32:24Z 2015-11-29T22:32:24Z DevOps Tools

Takipi is a production monitoring / debugging tool for the JVM and they've written up a very detailed multi-part guide on various production tools. I really like how they separate out the tools into various categories so you can see which tools are in the same "solution space" (i.e. you probably don't need to use more than one of them, unless there's a really good reason).

http://blog.takipi.com/the-definitive-guide-for-production-tools-24-ways-to-see-through-your-application/

tag:willchen.me,2013:Post/940778 2015-11-29T22:28:29Z 2015-11-29T22:40:14Z Logging stack

This is just a quick post to summarize my thoughts on logging and monitoring. I have spent a bit of time now setting up logging and monitoring for my particular product at OpenTable and I've quickly realized that it's a pretty deep topic.

I've included some resources below, mostly things that I've found helpful or seem interesting:

Source: Digital Ocean on ELK stack

Paid solutions for using StatsD:

  • There is a hosted Graphite / StatsD service that seems to have an affordable entry plan ($19/mo): https://www.hostedgraphite.com/hosted-statsd
  • Scout: https://scoutapp.com/signup

Paid solution for the "ELK" stack:

  • elastic, the company behind the three open-source projects of the ELK stack, has a SaaS offering for Elasticsearch (which can be tricky to operate especially as you scale): https://www.elastic.co/found/features
tag:willchen.me,2013:Post/940764 2015-11-29T21:36:25Z 2015-11-29T22:43:27Z Learning resources for December 2015

To do list for learning:

tag:willchen.me,2013:Post/940699 2015-11-29T17:50:49Z 2015-11-29T17:50:49Z Indepth guide on error handling

Node.js has long had a reputation of being difficult to write error handling for, in large part because of its async, evented I/O model. Joyent has a very in-depth article on error handling:

https://www.joyent.com/developers/node/design/errors

tag:willchen.me,2013:Post/940557 2015-11-29T07:49:59Z 2015-11-29T07:51:00Z Feedback driven development workflow

Today I spent some time working on two toy projects to get a better understanding of Typescript. My primary goal was to make a slack bot that could answer common questions on git. I decided to first make a blackjack app in Typescript because I had only briefly used it before and I wanted to have a quick refresher on the major concepts of Typescript in a domain that's very familiar (e.g. blackjack / playing cards). 

For me, learning the actual typescript type syntax hasn't been bad since I dabbled briefly in Go and I've been reading a bit on Typescript and Flow Type. There was a bit of a learning curve just figuring out the dev workflow for using Typescript since it means you need to transcompile before you can run your app. I used Visual Studio Code since it offers a really nice balance of the core benefits of an IDE (e.g. intellisense + debugging) with the speed and ease of use that lightweight text editors such as Sublime offer.

Setting up Visual Studio Code

If you use Visual Studio Code to compile your typescript files, you need to create two files in your project:

  1. tsconfig.json (https://github.com/willchen90/typescript-blackjack/blob/master/tsconfig.json)
  2. .vscode/tasks.json (https://github.com/willchen90/typescript-blackjack/blob/master/.vscode/tasks.json)

Then you can run your build task within VS Code and it automatically watches for changes - you can see the results in the bottom left corner. The two issues that I ran into are: 1) you need to re-run the watch task when you add a new file (it looks like this will be solved in Typescript 1.7.X - https://github.com/Microsoft/TypeScript/pull/5127) and 2) you don't know when it's "done" compiling since the watch task never ends, although the Typescript compiler seems to run very fast so that wasn't really an issue.

TSD - Typescript Definition Manager

The other thing I discovered is this tool called TSD (Typescript Definition manager) which is basically a package manager like Bower for typescript definitions (it seems to be a flat dependency structure, although I didn't dig in too deeply on this today). This makes it much easier to add typescript definitions as you essentially only have to manage one Typescript definition file from your application code (typings/tsd.d.ts). The main commands are "tsd init" and "tsd install lodash --save". 

Note: there seems to be a bug where if you include the flag before the package name, the command isn't executed properly. By contrast, tools like npm don't care whether you do "npm install --save lodash" or "npm install lodash --save".

Starting with the Slack chatbot client

Initially I was hoping to just run the chatbot client against the actual Slack API using Slack's somewhat supported node.js client (https://github.com/slackhq/node-slack-client), however I quickly ran into the rate-limiting issue (HTTP 429 - Too Many Requests). It seems like Slack has a pretty conservative rate-limiting policy of one message per second. I'm not sure if there's a way of "pay to play" to raise the limit or Slack really dislikes automated messages. 

Making a mock "chat client"

As a workaround I used a "mock" chat client built on node.js standard input and standard output, using Node's built-in readline module. The key to doing it was to isolate the slack client-specific code (which I had essentially copy and pasted from slack's example file) from the rest of the code I was developing.

It was actually a really simple implementation and could be reused for a variety of apps. The next issue I wanted to solve was not having to manually restart the node.js app every time I made an update. Of course it's not that much work, but it's annoying to have to remember to do every time so I used an npm module called nodemon. It's very popular for local development because it restarts your node app whenever it detects a file change. If you're creating a REPL-like app (e.g. a chat client), you want to make sure you set the "restartable" flag to false, otherwise nodemon will listen to stdin and you will get undesirable behavior like repetition of stdin input. It wasn't really clear what caused this from the Readme, but I figured it out by looking at a similar GitHub issue.
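The isolation idea can be sketched roughly like this. It is not the code from my repo: the `ChatClient` interface, the canned answer, and the in-memory mock are all assumptions for illustration. A readline-backed or Slack-backed client would implement the same interface.

```typescript
// Hypothetical seam: the bot only knows this interface, never Slack or stdin directly.
interface ChatClient {
  onMessage(handler: (text: string) => void): void;
  send(text: string): void;
}

// The bot logic under development. The canned answer is a placeholder.
function answer(question: string): string {
  const q = question.toLowerCase();
  if (q.indexOf("undo") !== -1 && q.indexOf("commit") !== -1) {
    return "Try `git revert <commit>` to undo a commit with a new commit.";
  }
  return "Sorry, I don't know that one yet.";
}

// Wires the bot logic to whichever client is passed in.
function startBot(client: ChatClient): void {
  client.onMessage((text) => client.send(answer(text)));
}

// In-memory mock client: records what the bot sends, so no network is needed.
class MockChatClient implements ChatClient {
  readonly sent: string[] = [];
  private handler: (text: string) => void = () => {};

  onMessage(handler: (text: string) => void): void {
    this.handler = handler;
  }

  send(text: string): void {
    this.sent.push(text);
  }

  // Simulates a user typing a line; a readline-backed client would call the handler here.
  receive(text: string): void {
    this.handler(text);
  }
}
```

Swapping the mock for the real Slack client then only touches the one place where `startBot` is called, and there is no rate limit to run into while iterating.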

Debugging!

For some reason, using the debugger seems to be pretty uncommon in node.js land. I think it's a combination of most JS developers using text editors (e.g. Sublime, Atom) without debugger support and that debugging transcompiled code (e.g. Coffeescript, Typescript) is oftentimes a pain. Luckily with source maps and new tools like Visual Studio Code, it seems like debugging is now a lot easier and actually fun to do. My recommendation for using the debugger in Visual Studio Code is to rely on the "Attach" setting, which essentially hooks into and debugs a node process that you've already started. I've included my example VS Code configuration. This is usually more straightforward than trying to launch a new node process through an IDE.

Unit testing

Eventually it just got too tedious to manually check outputs, even with the mock client. I created a small suite of unit tests using Mocha and Chai, which I was familiar with. Mocha is a very popular framework with a helpful, easy to look at website. The two tips that I have are: 1) use the watch flag (it's like nodemon for testing) and 2) use the "source-map-support" npm module so your error stack traces point to your original source files, not the transcompiled .js files. For an example of these two tips in action, look at my simple one-line "npm test" script.

I even launched the debugger in Mocha. Use the "--debug-brk" and not the "--debug" flag, otherwise you won't be able to attach your VS Code debugger to the Mocha test process. 

Wallaby.js - unit testing on steroids

Lastly, I want to mention Wallaby.js, which displays unit test results in your editor. It's looked very promising for a while, but I had some trouble using it in WebStorm a while ago (it seemed like the test results didn't update properly). I decided to give it another go since they just launched a Beta for Visual Studio Code (which recently open sourced its codebase and has developed an extension system). I only briefly played around with it, but it seemed quite reliable and I really enjoyed having the console.log information display at the bottom bar, and the code coverage so prominently displayed while you're coding. In essence, Wallaby.js seems like a next-generation testing tool. I'm going to spend some more time with it, and try to use it regularly at work.

To conclude: get more and faster feedback

Even though these were two really short projects (and they're incomplete), I've learned a tremendous amount just from exploring Typescript and all the other tooling that plays well with Typescript. I was initially worried that using Typescript would slow me down because 1) I wasn't too familiar with it and 2) it would be time-consuming to write type annotations. In the end, I think those concerns were proven false as I was able to be very productive with Typescript in a short amount of time. Getting errors from Typescript within VS Code was a huge help, and I was able to catch silly mistakes (e.g. typos, logic errors, etc.) in a very quick cycle. I think in the future, I will always consider using Typescript if I'm starting on a new Javascript project. The only downside of Typescript is that it takes a bit of setup to get a smooth workflow and some of the tooling lags behind Javascript (ES6) (e.g. linting, style-checking). However, I think those downsides are far outweighed by the benefits you get from it, and it feels like the Typescript ecosystem is alive and well, from the open and active development by Microsoft on their typescript GitHub repo to the new Angular 2 framework that is being developed almost entirely in Typescript.


Link summary: 

Tools that I used today:
  • Typescript - (to install the compiler: npm install -g typescript)
  • Visual Studio Code - https://code.visualstudio.com/
  • Mocha & Chai - https://mochajs.org/
  • Nodemon - https://github.com/remy/nodemon
  • Wallaby.js - wallabyjs.com
  • Source map support - https://github.com/evanw/node-source-map-support

Toy projects:

  • Blackjack - https://github.com/willchen90/typescript-blackjack
  • Git chatbot - https://github.com/willchen90/typescript-gitbot
tag:willchen.me,2013:Post/940061 2015-11-27T18:22:37Z 2015-11-29T07:51:52Z Test coverage doesn't matter?

tl;dr - it's unclear that very high test coverage is better than "ok" test coverage.

tag:willchen.me,2013:Post/922195 2015-10-26T04:32:55Z 2015-10-26T04:32:55Z Excellent talk on transactions, databases, and distributed systems

tag:willchen.me,2013:Post/919097 2015-10-19T03:01:51Z 2015-10-19T03:01:51Z Linda Rising

Interesting perspective on the people side of the software engineering business:

tag:willchen.me,2013:Post/917780 2015-10-17T01:52:44Z 2015-10-17T01:52:44Z Git workflow

https://github.com/tj/git-extras

tag:willchen.me,2013:Post/915683 2015-10-12T02:46:29Z 2015-10-12T02:49:25Z Emerging new technologies I'm watching

Like most developers, I always get excited about new, up and coming open source technologies. In the spirit of Thoughtworks Radar, albeit much more casually, I'd like to list some technologies that I think are quite interesting and worth monitoring.

Reputable technologies - Significant use in production and established communities

  • Docker
  • Kubernetes
  • React
  • Vagrant
  • Node.js + Express
  • Typescript

Rising star - Worth experimenting / prototyping with these technologies

  • Meteor
  • Relay + GraphQL
  • Falcor.js
  • Flow Type (by Facebook)
  • Deis - Heroku-like open-source PaaS
  • Angular 2
  • React Native
  • AWS Lambda
  • Babel.js - transcompiles future JS into ES5
  • ES7 - async, await
  • Redux - new, but popular flux-like architecture and tiny footprint

Moonshots - These are new, unproven technologies but have a potentially large benefit.

  • Black screen - terminal emulator built in Electron
  • Otto - created by the makers of Vagrant, provides a simpler workflow through more "magic"
  • Koa.js
  • Elm - functional programming made easier (a Haskell-like language that transcompiles into Javascript)
tag:willchen.me,2013:Post/915680 2015-10-12T02:32:11Z 2015-10-12T02:32:11Z Deis - Platform as a Service (PaaS)

I've heard a bit about Deis for a while but I never quite understood how it compared to other open source technologies like Kubernetes. This Stack Overflow answer was quite helpful: http://stackoverflow.com/questions/27242980/whats-the-difference-between-kubernetes-flynn-deis

Especially insightful was the comment by the founder of Deis, with a link to this technology stack diagram:

https://pbs.twimg.com/media/B33GFtNCUAE-vEX.png:large

tag:willchen.me,2013:Post/902880 2015-09-09T06:18:36Z 2015-09-09T06:18:36Z Guidelines for avoiding thinking pitfalls
  • People overestimate the impact of key individuals on an outcome (e.g. entrepreneurs, CEOs, politicians) by failing to account for other factors such as organizational culture, industry trends, economic climate, and so on.
  • External factors (to an individual) can be considered luck - and they are very important.
  • Linear regression when used properly is a simple but powerful tool for isolating the effect of something - however it only measures correlation not causality.
  • Causation is much harder to prove than correlation. Experiments (e.g. a/b testing) are ideal but not always practical, particularly for macro decisions (e.g. new product launch, M&A)
  • Complex outcomes (e.g. 5-year company growth, civil war in the Middle East) oftentimes have multiple causes and defy simple explanations.
  • Plans are too optimistic because people do not account for potential unknown unknowns.
  • Improve estimates by using reference points from similar cases. Modify your estimate based on these references by acknowledging differences from your current case with previous cases.
  • Consider reverse causality. Does good company culture lead to good financial performance? Or does good financial performance help foster a good company culture?
  • Consider confounding factors. Is there a third variable that affects both the independent and dependent variables you are looking at?
  • Look for probabilistic distributions. Things are rarely surefire and once in a while failure is inevitable.
  • Is there a worst case scenario that you haven't thought of yet? How undesirable is it? Is it so bad (e.g. the Great Recession) that once is one too many times?
  • Does the organization as a whole not take enough risks due to incentives for individuals to avoid taking risks?
  • Check for survivor bias. Are there instances that should be considered but are not because they are not well-known / out of business / etc?
  • Is the halo effect causing you to subconsciously use the company's good financial performance to attribute positive beliefs to other aspects of the company (e.g. strategy, culture, people, etc)?
  • Am I considering a point in time (e.g. cross-sectional) vs over time (e.g. longitudinal)? Studies over long periods of time are ideal. Asking individuals to recall events after the fact is likely to be affected by hindsight bias (i.e. risky decisions are seen as foolish / prescient given how things unfolded).
  • People overestimate their ability to assess others' skills. Quantitatively comparing interview scores against post-hiring job reviews allows you to improve the hiring process in a systematic way.
  • When analyzing the root cause of a failure, consider whether it was the failing of an individual (e.g. that person was irresponsible and should be fired) or the failing of a system (e.g. that mistake could have happened to any of us).
  • Self-evaluate which lenses / filters affect your perspective.
  • Make decisions based on expected value when possible - it will slowly add up
  • Use multiple disciplines when possible - think quantitatively and qualitatively.
tag:willchen.me,2013:Post/902869 2015-09-09T05:53:24Z 2015-09-09T05:53:24Z Notes on Thinking Fast and Slow and The Halo Effect

Thoughts from reading the books Thinking Fast and Slow by Daniel Kahneman and The Halo Effect by Phil Rosenzweig.

  • Humans are prone to cognitive biases which can lead to suboptimal decision making. Many of these biases are driven by our desire to make sense of what is happening around us and to create stories that logically explain major events.
  • The halo effect is the tendency to attribute traits (independent variables) based on outcome (dependent variable). For example, if a company has several years of growth, its CEO will be praised for being pioneering and aggressively entering an adjacent market. In the counterfactual situation where the company struggled, the same CEO might be criticized for wantonly taking risks and neglecting the core business.
  • People, particularly experts and optimists, greatly overestimate their impact or understanding. Leaders of organizations, such as the President, CEOs, etc., are seen as responsible for the overall outcome of their organizations. Studies have shown that a better CEO is only likely to outperform his or her counterparts 6 out of 10 times (i.e. only 10 percentage points better than flipping a coin). CFOs of major companies were asked to predict the movement of the S&P 500. Not only were they significantly off, they were overly confident about their predictions.
  • People go for simplistic explanations because they're easier to comprehend and follow. A complex explanation such as "growing a business requires a careful analysis of the competitive landscape, assessment of potential disruptive innovations, and managing risks through a balanced portfolio" is harder to follow than a simple explanation such as "building a great company is about focusing on customers and empowering employees". The reason why the first is more complex is not that there are more steps - it's that it focuses not on concrete actions (e.g. promote employees within your organization to groom leaders), but on a mindset (e.g. assess various scenarios in how 3D printing can affect this industry).
  • Survivor bias can lead us to idolize risk-taking companies because we are not as aware of the failures. Rather than taking into account the risk inherent in big moves such as acquisitions or entering a new market, people focus on the success stories. Michael Raynor wrote a book, The Strategy Paradox: Why Committing to Success Leads to Failure, that covers this issue.
  • System 1 is the lizard brain - it feeds off emotions and generates our gut reactions. System 2 is the higher order thinking brain. It can carefully consider tradeoffs, but it takes significant energy to use it, so we conserve energy by not using it whenever possible.
  • The framing of a choice makes a big impact on how people respond -- even professionals (such as doctors making medical decisions). People are consistently risk-averse when choosing between a guaranteed win (100% chance to win $500) and a very likely win (80% chance to win $650): they tend to take the guaranteed win. People are risk-seeking when given two bad choices, one of which offers a chance to entirely avoid the bad outcome (although with a smaller expected value). Prospect theory helps explain these decisions by focusing on the gains and losses vs. the final state, which is what expected utility theory focused on. Prospect theory also emphasizes the importance of the reference point, which is typically the previous state.
  • The two selves are the remembering self which is how you think about your memories and what you use to make future decisions and the experiencing self which is how you feel at a given moment.
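The arithmetic behind the framing example is worth spelling out. A small sketch, assuming the 80% gamble pays nothing in the other 20% of cases (the `Outcome` type and function name are made up for illustration):

```typescript
// One possible outcome of a gamble: a payoff and the probability of receiving it.
interface Outcome {
  readonly probability: number;
  readonly payoff: number;
}

// Expected value = sum of (probability * payoff) over all outcomes.
function expectedValue(outcomes: Outcome[]): number {
  return outcomes.reduce((sum, o) => sum + o.probability * o.payoff, 0);
}

// Guaranteed win: 100% chance of $500.
const sureWin: Outcome[] = [{ probability: 1.0, payoff: 500 }];

// Very likely win: 80% chance of $650, assumed $0 otherwise.
const likelyWin: Outcome[] = [
  { probability: 0.8, payoff: 650 },
  { probability: 0.2, payoff: 0 },
];
```

The gamble's expected value is about $520 versus $500 for the sure thing, so preferring the guaranteed $500 is risk-averse in exactly the sense described above.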
tag:willchen.me,2013:Post/902374 2015-09-08T04:04:34Z 2015-09-08T04:06:19Z The value of values

Excellent talk by Rich Hickey. Even if you never use Clojure, it's an enlightening talk about the benefits of immutable data that's applicable to everyday programming:


tag:willchen.me,2013:Post/900883 2015-09-03T08:39:40Z 2015-09-03T08:39:40Z Focusing on the core technical problems

This post is a brainstorm of ideas on how to focus on the core technical problems and solve supporting technical issues as efficiently as possible. Of course this makes some assumptions about what your "core problem" is, but I will assume this is a "typical consumer web / mobile app". If you're a devtool company such as Docker, for example, then your core domain and supporting domains may be completely different.

  • Follow the mantra "borrow, buy, or build" (I may have re-ordered it)
    • Ideally, there is a well-supported open source project that solves the problem that you can use. I think examples of this are Docker, Angular, Meteor, and so on. These are all very different technologies, but they all have in common: 1) large, active communities around them (which means you will have support when you run into issues), 2) many examples of them used significantly in production, and 3) strong core teams behind the projects (in this case, each of them has a corporation backing it)
    • Secondarily, if there's a commercial product or service that addresses your need, then strongly consider using it. This is a case where you may end up using Docker the open source project first, and then realize it's very useful to use something like Docker's registry for private repos.
    • Lastly, there is building your own solution. Sometimes, this means using an open source project like Docker registry but hosting it on your own physical servers or even putting it on AWS. This last one is typically about either 1) saving money (since you don't have to pay the premium pricing for a ready-to-go solution by the official provider) or 2) maximizing flexibility (e.g. add additional features, being able to switch vendors, etc.)
  • Use a managed service for storing persistent data
    • Use one of the big 3 cloud service providers' (AWS, Google Cloud Platform, Microsoft Azure) proprietary solutions (e.g. Aurora for AWS). You will probably pick one of the 3 providers for the majority of your cloud services anyway:
      • This is potentially the most powerful solution since AWS can optimize the hardware and networking specifically for that DB product. The biggest downside is vendor lock-in. If AWS becomes more expensive relative to others in the future, it may be difficult to switch.
    • Use a smaller cloud service focused on DBaaS such as Compose.io (although it's now part of IBM) or Heroku's Postgres service.
    • Databases can take quite a bit of work to setup, maintain, backup, etc. It's also the least forgiving part of your tech stack - mistakes can sink a business here. What you're really paying for, IMO, is 1) convenience - so you don't need to hire a dedicated sysadmin, 2) maintenance & backups - doing routine patches, nightly backups, essentially all the maintenance tasks that you ought to do but might not know or remember to do, and 3) someone to call if something goes deeply awry (this will usually cost the most in the form of an add-on enterprise support plan).
  • Use a widely-used UI framework
    • Wide usage is very important because it ensures most of those nasty browser incompatibility bugs get caught by the community and not in your product.
    • While many people worry about having a cookie-cutter theme (especially with Bootstrap), in practice if you customize the variables, e.g. font-faces, color scheme, base font size, etc., you can still have a unique UI look for your product.
    • The most interesting UI library out there is Semantic UI which uses a natural language approach and focuses on making it very easy to customize your site by swapping / combining style guides / themes.
  • Use a container management service
    • Containers have moved past the point of hype and have seen widespread production usage in many companies. Since the early days of containers, there have been many approaches to managing them, from using basic shell scripts to using traditional configuration management tools (e.g. Chef, Puppet, etc) to powerful resource management frameworks like Mesos.
    • The most popular open source project seems to be Kubernetes (backed by Google, which provides Google Container Engine), which is based on a decade plus of Google's experience using containers (from their legacy system Borg).
    • In theory, Kubernetes can be run on any cloud service, but I think Google Container Engine will provide the easiest setup (when I tried it a while ago, it was not trivial to set up in Azure, even with all the instructions).
    • Amazon has its own container service, ECS, which I don't know too much about, but it's a proprietary solution so there are concerns about vendor lock-in. It seems like Kubernetes has support from virtually every significant cloud player except for Amazon.
    • Use a container registry service
      • If you do go the container route, you will need to store all the Docker images in a Docker registry. (The size of all the images can add up quite quickly if you want to keep old versions).
      • Docker and Google offer paid services.
  • Use an event tracking service
    • Services like Mixpanel are incredibly useful and provide better reporting than most homegrown solutions.
  • Use a monitoring / logging service
    • New Relic has a proprietary solution that seems quite advanced (focuses on monitoring availability + performance) - it's pricey but there's a free tier.
    • Many companies like to use ELK (Elasticsearch, Logstash, and Kibana) for monitoring.
  • Use a CI / CD service
    • The most established services in this space are Codeship, Circle CI, and Travis CI (which has mostly focused on open source projects and recently offered private project services).
    • Shippable is a relative newcomer and focuses extensively on Docker / container-related capabilities. Their pricing also seems to be the cheapest at $10/container.
  • Use PagerDuty
    • I'm not sure what the next-best alternative to PagerDuty is, but they do a great job of creating a straightforward application that does what it sounds like.
  • Use a code repository service (e.g. GitHub)


Other services:

  • Translation management service
  • Visual diff - not aware of any services / well-supported open source project for this area
  • Task management (Asana, Hansoft X)
  • Cross browser / device testing (Sauce Labs)
]]>
Will Chen
tag:willchen.me,2013:Post/900869 2015-09-03T07:48:58Z 2015-09-03T07:48:58Z If I was doing a project, I would use...

If I wanted to get something done as quickly as possible... I would use Meteor because it provides the most complete solution (e.g. "full-stack" framework). It greatly simplifies tedious tasks such as setting up authentication, ensuring real-time updates synchronize properly, deploying to the cloud (Galaxy should soon make it easy even for production deployments), and building for multiple platforms such as Web, iOS, and Android.

If I wanted to make a small-scale application (e.g. internal app)... I would use Meteor because it's relatively well-used in small scale apps. If I don't expect the number of concurrent users to be that high, Meteor could likely work out of the box with relatively few performance tweaks.

If I wanted to make first-class desktop & mobile web apps... I would use React and React Native and share as much of the business logic as possible between them (perhaps using a Flux-like architecture such as Redux). If the performance requirements for the mobile web app were less stringent, I would consider evaluating Meteor. Hansoft X is building a first-class desktop & mobile web app using Meteor and they used a few interesting tricks to ensure the mobile UX was optimal (read their blog post for details).

If I wanted to make a simple / relatively static website... I would avoid coding and use an out-of-the-box solution like WordPress + template or a full-service solution like Squarespace. These tools are really easy to use and the benefit of hand-coding a static site is pretty minimal unless you're looking to create a truly unique UI / UX (e.g. April Zero).

If I wanted to create a large-scale (in terms of users, complexity, team) rich web application... I'm not sure what I would use. I would first consider React + Flux + whatever on the backend (probably Node.js + Express unless there were legacy requirements). If there are real-time requirements (e.g. seeing real-time messages from other users), I think using Meteor and investing significant resources in scaling / load testing / etc would be the right solution. As Geoff Schmidt, co-founder of Meteor, noted, there's no easy, "off the shelf" way of building an application at "Facebook scale" without a lot of engineers working on the problem.


Some random thoughts:

I think a year or so from now, when Angular 2 has been released and is used for production apps, it could be an interesting framework to use. For me, the benefit of a framework like Angular 2 is that it borrows some of the best ideas from React (e.g. unidirectional data flow) and combines them with a powerful set of supporting tools (e.g. animations, material UI components, testing tooling, etc.).

In general, I think Meteor has a lot of great benefits, many of which I've highlighted above. If I had to pinpoint my key concerns, they would be around:

  • Integrating with legacy applications / services - while Meteor is composed of many components (e.g. build system, packaging, minimongo, etc.) that are in theory swappable, in practice Meteor is almost always used as a full-stack solution and there aren't many examples of people using parts of Meteor and integrating them with legacy systems. This isn't to say you couldn't do it, but so much of the community / documentation is focused on using the whole Meteor stack that if you run into issues swapping out the Meteor build system for gulp (because the rest of your company has a shared gulpfile, for example), then you may have some lonely times debugging problems.
  • High scalability - If you know your application will have "Facebook scale" user requirements (which is highly unlikely, unless you're already working at Facebook / Google / a tech company with a huge user base, and in that case you probably have an in-house solution you can use anyway), then Meteor's lack of a track record with very high scalability may be worrisome. However, to be honest, it's unlikely that your application will reach that scale. And the truth is, any system that you create initially will need significant rewriting / fine-tuning to make it highly scalable.
  • Testing - to be honest, I would have expected Meteor to have a core / official testing framework by now. One of the best parts about Angular was that it focused on testability from the get-go and spawned fantastic testing tooling like Protractor and Karma. However, they announced they're going to move Velocity, the official community testing framework, into core, so this will get resolved eventually.
  • Hosting - I think once more information about their Galaxy service comes out, any concerns on this topic will be addressed. For now, I've read anecdotally that it's non-trivial to deploy a Meteor application, especially when scaling it out horizontally.
In short, if you're starting a new project, particularly as a startup, I think the benefits of Meteor -- namely, you can move fast and focus on your core domain -- outweigh the disadvantages.
]]>
Will Chen
tag:willchen.me,2013:Post/896906 2015-08-23T23:04:45Z 2015-08-26T03:42:45Z End of 2015 learning plan

In the last four months of 2015, I'd like to do the following:

  • Write many small applications in Go
  • One big JS application
  • One dev tool (hotspot analysis)
  • Read more paper books
  • Read the following kindle ebooks
    • How to Solve It
    • How Linux Works
    • An Introduction to Functional Programming Through Lambda Calculus

Specific example ideas:

  • Simple Go web scraper
  • BART real-time estimator Go App
  • Relay app - measure daily activities

Tutorials:

  • https://egghead.io/series/mastering-asynchronous-programming-the-end-of-the-loop
]]>
Will Chen
tag:willchen.me,2013:Post/896904 2015-08-23T22:55:29Z 2016-07-05T05:20:16Z Hammock Driven Development

Rich Hickey gave a very interesting talk on problem solving titled "Hammock Driven Development". Rich took a much more personal (experiential) angle with this talk, so while it was sparse on academic research references, it does seem to fit with at least some of what the research on brain science says. While it may not be wholly scientific, it was still a very thought-provoking talk.

Takeaways:

  • Don't code right away. Instead, start with...
  1. Identify the problem you are solving. The purpose of software engineering is to solve problems, and incidentally build features to solve them. Rich advocates stating the problem out loud with your team, or even better write it down (which is pretty much applicable to every step after this as well).
  2. Understand the problem fully.
    1. Facts - what do you know about this problem? (e.g. what are the business requirements?)
    2. Context - relevant background information. (e.g. your team uses these OS / frameworks, and there will be less friction to using something similar)
    3. Constraints - implicit hard requirements (e.g. must be able to scale to 10,000 concurrent users; must be 99.99% available)
    4. Are there things you know you don't know? (e.g. where is the data coming from? what happens when this API service goes down?)
    5. Are there solutions to similar problems? (e.g. an open source library addresses a related use case)
  3. Deeply explore potential solutions
    1. What are the problems in your potential solution?
    2. What are the trade-offs? (everything has a tradeoff)
    3. Question marks (what do you not know yet?)
  4. Flip between consuming lots of input and no input
    1. Feed your Background mind: Consume lots of input (read articles, look up related problems, and critique their solutions)
    2. Let your Background mind absorb: No input, just meditate, and focus on what you were feeding your mind
  5. Get plenty of sleep - your brain is doing important work!
    1. Get at least a night of sleep before you make an important decision
  • Take the time to focus on things. You will inevitably "drop the ball" on certain things, which is OK, but tell the people important to you that you are focusing on something.
  • Don't lean on the feedback loop during development (e.g. TDD) - it's important and you will iterate on it, but don't rely on it instead of focusing on the previous steps.
  • Don't be afraid to make mistakes, requirements will change, you will come up with a better idea... it's OK
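
A constraint like "must be 99.99% available" (from the Constraints step above) is easier to reason about once it is translated into a downtime budget. Here is a minimal sketch of that arithmetic; the function name is made up for illustration and it assumes a 365-day year:

```python
def allowed_downtime_minutes(availability_percent, days=365):
    """Return how many minutes of downtime fit within an availability target."""
    total_minutes = days * 24 * 60  # minutes in the period
    unavailable_fraction = 1 - availability_percent / 100
    return total_minutes * unavailable_fraction


# 99.99% availability leaves under an hour of downtime per year.
print(round(allowed_downtime_minutes(99.99), 1))  # 52.6
```

Seeing the number written down (under an hour per year) makes it clearer how much a constraint like this limits the possible solutions.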

    https://www.youtube.com/watch?v=f84n5oFoZBc

    ]]>
    Will Chen
    tag:willchen.me,2013:Post/887864 2015-08-23T22:38:35Z 2015-08-23T22:38:35Z Getting Better Sleep
    • Read a book for at least 30 minutes - (better if it's dry)
    • Meditate for 10 minutes
    • Write down any thoughts on the iPad for 15 minutes
    ]]>
    Will Chen
    tag:willchen.me,2013:Post/896901 2015-08-23T22:38:10Z 2015-08-23T22:38:10Z What I don't know

    Improvement areas:

    • Domain modeling - Improve OO programming
      • Understand the requirements of a domain before starting
      • Create the lowest-level (smallest, least powerful) domain model and start defining its properties and characteristics
      • Work your way up (solving the main problems of the domain), pushing the abstractions to the lowest layer possible
      • Understanding tradeoffs of instance lifecycles - e.g. using a singleton vs. creating new instances
    • Back-end development - Become familiar with Go, Java, and Python
      • Understand backpressure
      • Multithreaded vs. single threaded
        • Mutex (mutual exclusion)
    • Security & Auth -
      • Learn more about OAuth
      • Tokens - when do they expire? How do you revalidate / get new tokens once the old ones expire?
      • Common security loopholes in web apps
        • XSS (cross-site scripting)
        • CSRF (cross-site request forgery)
        • DDoS (distributed denial of service)
    • Low-level development
      • Memory management
        • GC (ref. counting vs. mark and sweep)
        • Heap vs. stack
      • Javascript VM in the browser
        • JIT, inline optimization
    • Fundamentals of CSS / Sass - Understanding basic CSS / Sass concepts and being able to figure out advanced issues on my own:
      • Box sizing model (e.g. border box)
      • Floats (when you use them, why it's a hack)
      • Specificity (is it !important, ID, class, element?)
      • Are inline styles prioritized over linked stylesheets? Does the last linked stylesheet win when the specificity is tied?
    • Production monitoring & health
      • Understand what Elasticsearch, Logstash, and Kibana do
      • Creating dashboards in Kibana / Grafana
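
For the CSS specificity questions above, here is a rough sketch of how competing declarations for the same property get ranked. It is a simplified model, not the full cascade: it only covers author styles and ignores things like user-agent styles and cascade layers. The `Declaration` class and the `priority` and `winner` helpers are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    value: str
    ids: int = 0              # number of #id selectors
    classes: int = 0          # classes, attributes, and pseudo-classes
    elements: int = 0         # element and pseudo-element selectors
    inline: bool = False      # set via a style="" attribute
    important: bool = False   # marked with !important
    order: int = 0            # position in source order (later is larger)


def priority(decl):
    # Compared left to right: !important first, then inline styles,
    # then specificity (ids, classes, elements), then source order.
    return (decl.important, decl.inline, decl.ids,
            decl.classes, decl.elements, decl.order)


def winner(declarations):
    """Return the declaration that applies under this simplified model."""
    return max(declarations, key=priority)


by_id = Declaration("red", ids=1, order=0)
by_classes = Declaration("blue", classes=2, order=1)
print(winner([by_id, by_classes]).value)  # red
```

Under this model, one ID outranks any number of classes, inline styles outrank stylesheet rules unless `!important` is involved, and when everything else is tied the later declaration wins.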
    ]]>
    Will Chen
    tag:willchen.me,2013:Post/896898 2015-08-23T22:18:31Z 2015-08-23T22:18:31Z CAP: 12 years later (by Eric Brewer)

    I partially read through this article. It's very interesting, particularly how Eric debunks the "pick 2 of 3" myth, which ignores the granularity of the decisions / trade-offs developers can make:

    http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed

    ]]>
    Will Chen
    tag:willchen.me,2013:Post/893892 2015-08-15T05:09:06Z 2015-08-15T05:27:55Z Learn By Writing

    I've realized that one trap I fall into is spending too much time watching videos and reading articles and books, and not enough time practicing the basics: just writing a simple app using a new technology (library, framework, language, etc).

    Learn Meteor by example:

    ]]>
    Will Chen
    tag:willchen.me,2013:Post/893889 2015-08-15T04:58:11Z 2015-08-15T04:58:11Z State of the Union on Javascript

    For a summary: http://info.meteor.com/blog/javascript-state-of-the-union-with-meteor

    ]]>
    Will Chen
    tag:willchen.me,2013:Post/891470 2015-08-08T06:12:15Z 2015-08-08T06:12:15Z Reading List for Golang


    ]]>
    Will Chen
    tag:willchen.me,2013:Post/888066 2015-07-30T03:59:59Z 2015-07-30T03:59:59Z Tour of Go Lang

    Completed first four of six modules today.

    To do:

    ]]>
    Will Chen