Sunday, April 18, 2021

Another hit and miss

 

Well, the Golang project is over.


In over fifteen years of work, this is the first time in my life I've been given the option to "switch languages or leave the company."


What happened.

As you know from my previous post, the company used Rails plus a mixed Node.js / TypeScript service, with everything hosted on Heroku.


After my team joined, the only developer lasted a couple of weeks and then left.

What I didn't know is that I would be repeating his steps. 

I feel like I've been in a loop over these ten months.


With the original developer gone, I realized that he had been severely overworked. He didn't know the entire application, but he knew its critical points.


The stack of Heroku + Shopify / Recharge had several weak points.


Recharge

Well, it is awful. From time to time, we had to chase bugs on the Shopify side because customers could not see their payment as completed, even though we had not changed any code at all.


E-mails to Recharge support usually took a long time to get a reply.


I don't recall if the company was paying enough to receive better support, but I don't think the answers we got were acceptable.

To give a better example, Recharge operates with API keys, like AWS, for instance, and the keys have a quota.

The first time I contacted them, they provided me with the quota consumption data.

The second time I contacted them for a similar issue a couple of months later, the support person didn't want to provide me with that information.


Shopify

I guess that e-commerce is an alien world for me. I've never done it, and this is my first time.

During my time at that company, there were several conversations about e-commerce platforms: some started and failed completely; others, like Shopify, did not.

I see the appeal for mom-and-pop shops and fancy small fashion design studios.

There are "risky" points like setting up a domain, but you can delegate that to the Shopify team for a fee.


What Shopify handled terribly were the platform API changes.

They have a versioning system that lets you know when you have to update your version, but what they did this time was different.

They announced that by a specific date in January, payment plugins like Recharge would stop working entirely, and that you had to manage payments differently.

We never found out what the change was, and we had to rush as I've never done in my life.


Shipstation

Well, a standard JSON API with a rate limit.

It doesn't offer a sandbox. You need to contact the business on their end and put in a credit card, because there is no real sandbox, and if your test order gets delivered, you get charged.

What can I say? I believe that "a system without a sandbox" pretty much clarifies where you are standing. Other than that, there is nothing to say about them.

 

My ex-team and myself

Since I was never part of the decisions, I don't have explicit knowledge of what happened.

As far as I knew, the company understood that Shopify was screwing the pooch and wasn't providing good answers.

We were missing team members. We were going to rush and use Stripe.

The plan was to release the V2 version.

A plan to deploy V2 in phases was drawn up, everyone was on board, and they all fully understood what we were doing.

We hit the first milestone more or less correctly. I interconnected the V1 site with the V2 React site via GCP Cloud Functions and Redis.

My director of operations had a pattern of siloing information, so you are completely blind. Since we've had the same results for the third time, I believe that he either doesn't communicate correctly with the client or hides information.

I worked on both Christmas and New Year's Day.

My QA partner had been missing since forever.

Apparently, we had to inform him that we were working on V2, although he had been present in the stand-up calls for four months.

Apparently, it never crossed his mind that he should ask what we were doing or how he was supposed to test things.


But the best part came from my director of operations.


It was our fault (the developers, not him) that we didn't inform the QA.

Apparently, another of our tasks was to do the QA's work too: tell him how to test things and that he should be ready for the unspecified release date.


We were still rushing. My other tech lead developer has a more full-stack mindset, so she worked thoroughly on the backend and coordinated the front-end team.


I created the whole dockerized backend environment for the developers, the pipelines for all the services, front and back, and the pipeline step that backed up the DB before deploying.


We both stayed late for a month and a half to absorb the bits of business logic that our director of operations released from time to time and to unblock the team.

And all the things that I now entirely regret doing due to the ending of this adventure.


On February 12, the director of operations ordered us to deploy V2.

We informed him, "this is not tested at all".


He said that he was being pressured by marketing because they had a huge budget to spend on ads and couldn't hold it any longer.


Due to his pedantic nature, he dismissed our assessment that the code wasn't tested and said that four days would be enough.


I don't have to expand on the dangers of releasing untested code to production. Still, oh boy, untested e-commerce is unnaturally insane.


The business logic that apparently nobody knows in that company wasn't completed. We had to manually fix records and work 12 hours sometimes to resolve issues.


The whole plan to have a controlled release crashed and burned, and nobody cared about that at all.



We (the developers) burned ourselves out trying to keep that thing afloat. I've never reached this level of stress and despair.


The director of operations left or was let go.


We observed that the company had been seeing other staff augmentation companies.


They assured us that they were just talking, and obviously, you know where this is going.


When they let go of the director of operations, they told us that the company might switch to Rails / Node.js because e-commerce uses that and not Golang.


The team members were really vocal about this change, and nobody wanted it to happen.


They made me convince my other tech lead partner to stay (she was going to quit) so we could resolve that and move forward. I deeply and entirely regret asking her to do that.


We were three weeks away from completing the automation of the system, and they agreed.


But it was terribly in vain.


One week into the development of the "last phase," they asked us to switch back to V1 because they wanted to be stable (despite V1 being the most unstable application).


Following that rollback week, they called us to a meeting, where we were told:

"There is a new leader who will lead all of you."

"We will code in Rails, and if you don't like it, you can leave the company."


Damn.


That weekend my stress almost killed me.

I had orthostatic hypotension.

I fainted three times in the bathroom. I hit my head really badly each time, and I think that I may have fissured my right elbow.


The end of a contract.

We reverted the company to V1 as they asked, and now I'm out of a job.

It seems that again I'm going to be without employment for a good couple of months.

I've got to take care of my health, study for interviews again.

But what hit me the most is the time that I spent sitting at this computer, rushing to complete something that was then dismissed entirely.

No questions asked, no nothing; they simply offered me the choice to change languages or leave.


I've designed the architecture, coded channels to handle events, integrated third-party systems (Shipstation), and now all that is in the trash.


All the time I spent on that is never coming back.


I've only touched the surface of Golang and GCP, which I guess is a pyrrhic victory for me.


I'm getting old, my mother is getting old, and all the time I did not spend with her is not coming back.


I spent all that time trying to do the impossible, like when I was twenty-something, for a company that didn't even ask me what I thought about the design change.


Again, like at the end of 2017, I'm on my own, alone.


Saturday, October 17, 2020

Moved to GCP and Golang

It has been a while since I've written something.

I've left the real estate company; the CFO that joined was a real piece of work.


I spent a couple of months without a job. I've been moving with the same team for 3 years now.

During that pandemic vacation time, I spent time playing with GitLab, pipelines, React, Redux, Tornado in Python, and Heroku.

We started a new contract with a Series A company whose core is built on Rails / TypeScript and Shopify.

We are moving towards GCP / Golang / Shopify, but there is a long road ahead.

While we prepare for that, I've been playing around with HackerRank exercises.

I've been using them as a sad excuse to use Golang, because there is some road ahead before we start implementing Golang at the company.

Since I've been working with Python since 2012, certain things feel different.

One simple thing: there is no built-in way to determine the position of an element in an array. You need to iterate over the array to find the position.

This may sound trivial, but when you've got huge collections, like arrays of 9 thousand elements, it is going to be slow.
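
A small sketch of the trade-off, written in Python only to keep the examples on this blog in one language (the same idea in Go would use a `map`). The function names are made up for illustration: one version scans on every lookup, the other pays the scan once and remembers each position.

```python
def find_by_scan(items, target):
    """Linear scan: walk the list until the target shows up (O(n) per lookup)."""
    for position, value in enumerate(items):
        if value == target:
            return position
    return -1


def build_position_index(items):
    """Pay the O(n) cost once and remember the first position of each value."""
    index = {}
    for position, value in enumerate(items):
        index.setdefault(value, position)
    return index


queue = list(range(9000, 0, -1))      # 9000, 8999, ..., 1
index = build_position_index(queue)

print(find_by_scan(queue, 1))         # has to walk all 9000 elements
print(index.get(1, -1))               # a single dictionary lookup
```

The index only helps when the collection is not reshuffled between lookups; otherwise it has to be rebuilt or kept in sync.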

I've been focusing mostly on a really down-to-earth exercise, New Year Chaos.

I honestly love the problem. It is so simple, yet as soon as you've got more test cases, you promptly challenge your previous solution, since it turns out to be too slow to complete in the expected time.

I think that is really a great exercise.
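
For reference, a sketch of one commonly used approach to New Year Chaos, not necessarily the solution I ended up with. It assumes the usual statement: people are numbered 1..n in their starting order, and nobody may bribe more than two times. Returning `None` for the "Too chaotic" case is my own convention here.

```python
def minimum_bribes(queue):
    """Return the minimum number of bribes, or None when the queue is too chaotic."""
    bribes = 0
    for position, person in enumerate(queue):
        # `person` started at index person - 1; ending more than two places
        # ahead of that would have required more than two bribes.
        if person - 1 - position > 2:
            return None
        # Whoever bribed `person` cannot stand further ahead than one place in
        # front of person's original spot, so only that short window is scanned
        # instead of the whole prefix.
        start = max(0, person - 2)
        for ahead in range(start, position):
            if queue[ahead] > person:
                bribes += 1
    return bribes


print(minimum_bribes([2, 1, 5, 3, 4]))
print(minimum_bribes([2, 5, 1, 3, 4]))
```

Limiting the inner loop to that window is what keeps it fast enough on the large test cases, compared with counting every larger number in front of each person.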


Without further comment, we are two months away from the end of the year and we haven't had our baptism of fire with Golang, so I'm going to keep on fighting with that HackerRank problem in my spare time.

Thursday, November 21, 2019

Ecs, CircleCI, NLB and Grpc

I finished a migration from Heroku to ECS using CircleCI orbs for AWS.
Below I detail aspects of the migration and the peculiarities of the project.

CircleCI

 

I knew of CircleCI back from 2016. I just knew that it was a more accessible tool to configure than Jenkins, and the project I was working on already had it.
Back then, I did not pay much attention to it since my main task was developing.
Cut to mid-2019: the company I'm working for switched from Jenkins to CircleCI. They did not want to spend resources or time on it; they wanted to "have it working without spending time fixing it."
They moved to CircleCI, and everything is working without the problems we had with Jenkins.
This new project I was working on was deploying to Heroku with a bash script.
We opted to use ECS, and we found that CircleCI had support for it.
The documentation for the ECS orb is not clear at all if you don't first spend time reading up on the CircleCI orb terminology.
There is no clear differentiation between the job and the workflow, and I spent days confused.

ECS

 

I already talked about ECS; three years later, I'm back with it. Things are pretty much the same as before. Now we've got Fargate, which I have not used, and the documentation on crucial aspects like blue/green deployment is tailored towards it, which sucks, but it is what it is.
Things are the same: you create a cluster, you create services, and you place tasks on it.
The new things I'm using this year are:
  1. Blue/green deployment (soon).
  2. NLB for GRPC (more on it in the GRPC section).
What I do have to recommend is that if you are planning on expensive IO operations and you are planning on using "t2.small" as the best machine, then stay away from ECS. You are going to have downtime because the IO consumes the CPU credits fast.
We've got a situation with a monolithic application that we won't spend time refactoring. Still, the section of the code that works with PDFs has a lot of IO code; some parts are abusing IO due to the library they are using, and some parts abuse IO in the monolithic application itself.
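
To put a rough number on the CPU credit warning above, here is a back-of-the-envelope sketch. The figures for t2.small (1 vCPU, 12 credits earned per hour, a maximum balance of 288) are what I remember from the AWS T2 documentation and should be checked against the current docs; the function name is made up. One CPU credit is one vCPU at 100% for one minute.

```python
def minutes_until_credits_run_out(credit_balance, credits_earned_per_hour,
                                  vcpus=1, utilization=1.0):
    """Minutes the instance can sustain `utilization` before the balance hits zero."""
    credits_spent_per_hour = 60 * vcpus * utilization
    net_drain_per_hour = credits_spent_per_hour - credits_earned_per_hour
    if net_drain_per_hour <= 0:
        return float("inf")   # at or below baseline, the balance never drains
    return credit_balance / net_drain_per_hour * 60


# Assumed t2.small figures: full balance of 288 credits, 12 credits earned per hour.
print(minutes_until_credits_run_out(288, 12))
# A small balance, for example shortly after launch (30 credits is an assumption).
print(minutes_until_credits_run_out(30, 12))
```

Once the balance is gone the instance is throttled to its baseline, which is where the downtime comes from when the workload keeps the CPU busy.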

Grpc

 

Again with gRPC. Same situation as before, but this time I learned a bit more about how to debug the server internally and how to force the channel closure.
The "options" parameter for the server and for the channel on the client receives a list of tuples. It lacks documentation in Python, and you need to dig deep into the source to figure out how things work.
Options that you can use are also missing in Python, so I ended up reading Go examples that gave me an idea.
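
As an illustration of the shape of that parameter, a sketch of two option lists. The option names are channel arguments defined by gRPC core; the values, the `describe` helper, and the host name in the comments are placeholders of mine, not what the project used.

```python
# Each option is a (name, value) tuple; the whole thing is a plain list.
SERVER_OPTIONS = [
    ("grpc.max_connection_age_ms", 30_000),       # ask clients to reconnect periodically
    ("grpc.max_connection_age_grace_ms", 5_000),  # grace period for in-flight calls
    ("grpc.keepalive_time_ms", 10_000),
]

CLIENT_OPTIONS = [
    ("grpc.keepalive_time_ms", 10_000),
    ("grpc.keepalive_timeout_ms", 5_000),
    ("grpc.lb_policy_name", "round_robin"),       # client-side balancing policy
]


def describe(options):
    """Render an option list for logging, handy when debugging what was applied."""
    return ", ".join(f"{name}={value}" for name, value in options)


print(describe(SERVER_OPTIONS))
print(describe(CLIENT_OPTIONS))

# With grpcio installed, the lists are passed as they are:
#   server = grpc.server(futures.ThreadPoolExecutor(max_workers=4), options=SERVER_OPTIONS)
#   channel = grpc.insecure_channel("my-service.internal:50051", options=CLIENT_OPTIONS)
```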

NLB

 

I'm writing this on 11/21/19: NLB does indeed work with the ECS EC2 launch type without any problem.


The crux of the problem

 

I am using ECS (EC2) with a network load balancer to have the gRPC server working.
We are using an internal NLB because we are not exposing gRPC to the outside world; this is just for our microservices.
The main problem is that the NLB does not balance across the machines behind it. Once a connection is opened through the NLB, it keeps being reused, no matter if you close the channel in the client.
That cascades into another problem: if I set up an autoscaling group because the EC2 instance that is serving is degrading, I don't have a way to force traffic onto the new instances without downtime, which indeed sucks.
In theory, gRPC also offers balancing at the client level, but since I'm deploying on ECS, I don't have a good way to fix the IP of the server, because the IP will change after the first deploy.
I don't know; perhaps I could opt for a strategy like attaching an elastic IP and find a way to articulate this in ECS?

I will write my findings if I ever find a solution to this problem.




Wednesday, November 6, 2019

Reliable queue

While doing research for my current project, I found an exciting topic to dig into.
Instead of going with the usual solutions such as Celery, RQ, RabbitMQ, or ZeroMQ, for example, I discovered a post about reliable queues using just Redis.

The solution itself is quite straightforward.
You push incoming tasks onto a Redis list with [L|R]PUSH, and you retrieve them with [B]RPOPLPUSH, which pops a task and puts it in a second list in one step.
N workers consume that list, and you obtain your desired result.
The worker is simple. It has two steps:
set a lock (SETEX) and run the task.
There is a monitor that evaluates the locks, and if a task does not finish, it pushes the task again to be rerun.
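
A minimal sketch of those steps, not the implementation from my proof of concept. It assumes a redis-py style client created with `decode_responses=True` (so tasks come back as strings); the key names and function names are made up. Any object exposing the same methods (`lpush`, `brpoplpush`, `setex`, `lrem`, `delete`, `exists`, `lrange`) will do.

```python
PENDING = "tasks:pending"          # hypothetical key names
PROCESSING = "tasks:processing"
LOCK_PREFIX = "tasks:lock:"


def enqueue(client, task):
    """Producer side: push the task onto the pending list."""
    client.lpush(PENDING, task)


def worker_step(client, handler, lock_ttl=30, timeout=1):
    """Take one task, lock it, and run it. Returns False when nothing was queued."""
    # Pop from pending and push to processing in one step, so a crash right
    # after the pop cannot lose the task.
    task = client.brpoplpush(PENDING, PROCESSING, timeout=timeout)
    if task is None:
        return False
    # The lock expires on its own if the worker dies or takes too long.
    client.setex(LOCK_PREFIX + task, lock_ttl, "1")
    handler(task)                  # if this raises, the task stays in PROCESSING
    client.lrem(PROCESSING, 1, task)
    client.delete(LOCK_PREFIX + task)
    return True


def monitor_step(client):
    """Requeue every in-flight task whose lock is gone. Returns how many moved."""
    requeued = 0
    for task in client.lrange(PROCESSING, 0, -1):
        if not client.exists(LOCK_PREFIX + task):
            client.lrem(PROCESSING, 1, task)
            client.lpush(PENDING, task)
            requeued += 1
    return requeued


# Usage, assuming a local Redis and the redis-py package:
#   client = redis.Redis(decode_responses=True)
#   enqueue(client, "send-report-42")
#   worker_step(client, print)
```

The sketch has a known gap: between the pop and the SETEX the task has no lock, so a monitor running at that instant would requeue it. It also uses the task body as the lock name, which only works if tasks are unique. Newer Redis versions mark BRPOPLPUSH as deprecated in favor of BLMOVE, which does the same job.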

This pattern provides me with several good points to describe.
  • I can scale redis.
  • I can put up to n workers to consume.
  • The monitor can take care of tasks that do not complete or that expire.
  • Messages can indicate how long they take.
  • If the worker crashes, the monitor can put the task in the queue again.
  • Tasks are in memory.
The bad points I foresee
  • If the monitor crashes and does not catch up with Redis, it is a potential disaster.
  • If Redis crashes, everything is lost. I need to work on this point.

Besides those points, there is a fundamental point.
I'm running on ECS again, and the largest instances my client allows me to use are t2.small.
At one point, I was allowed to add a1.xlarge or a1.large instances, but the cost of adapting to ARM exceeds the time I can spend on this.

During the sprint, my proof of concept was pushed to a later sprint, but I'm spending my free time writing an implementation of this excellent solution.

Perhaps I'm talking ahead of time, but I'm excited to learn something new.

I'm writing this proof of concept here. Perhaps I'm overly optimistic and I do not see the future problems, such as how Redis scaling works, how to synchronize the tasks well, or how to relate I/O to execution time, which is an important part of keeping everything oiled.

In this solution, the other fundamental part of my research involves using gRPC. The workers use a gRPC client to place a message, so execution time seems trivial; we receive all the data and call the client.

To create random blocking time, I went with the classic Fibonacci calculation. During this implementation, I found the closed-form formula to calculate Fibonacci; it caught my attention and I got a bit sidetracked.
I was never strong in maths, and I recognize that I'm not good at it, but the closed formula is a fantastic alternative to recursion. However, I'm noticing that if we calculate the Fibonacci of 100, the result of the closed formula differs from the recursive one.
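
A sketch of that difference, assuming the closed formula is evaluated with ordinary floating point numbers (I'm using an iterative version for the exact values, which gives the same results as the recursive one). The function names are mine. A 64-bit float carries roughly 15 to 16 significant digits, and the Fibonacci of 100 has 21, so the rounded closed-form result cannot match the exact integer.

```python
import math


def fib_exact(n):
    """Exact value using Python's arbitrary-precision integers."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """Binet's formula, round(phi**n / sqrt(5)), evaluated with 64-bit floats."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    return round(phi ** n / sqrt5)


for n in (10, 100):
    exact, approx = fib_exact(n), fib_closed_form(n)
    print(n, exact, approx, "match" if exact == approx else "differ")
```

For small n both functions agree; the gap only appears once the result needs more digits than a float can hold, so the formula itself is fine and the limit is the number type.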

Either way, this is my weekend project besides others.
I doubt that anyone is reading this, but well, if this works, this is a log of operations.

Saturday, September 14, 2019

2017 - 2018 - 2019 update

A long time has passed since my last entry.
Let's recap. I was stuck in a really awful contract in fin-tech.
I couldn't bear it anymore and got a better job.
Just about 3 months until the end of 2017, I joined Shiftgig in Argentina.
Better pay, serious architecture, a promise to learn more, a lot of developers, and a possible career in leadership.
And the company was sold in December 2017.
Jobless.
I spent Christmas and the end of the year without spending too much cash.
Small savings, big mistake.
Fuck, I was lost.

I spent less than a month and got a new contract with Thirstie.
Remote work, much more pay than Shiftgig.
0 hassle.
Things went well for a year. I don't have much to comment; I was supposed to become a leader again.
January 1st, 2019: I break up with my girlfriend of 2 years.
February 15th, 2019: my mentor and director of operations in Argentina, who was preparing me to be a leader, has to hire a CTO.
The CTO fires him when he visits us.
The promise to become a leader vanishes, and the team members that were with me do not acknowledge that we have a leader, despite me being the supposed person.
August 2019: Argentina again with the "your dollars are not your dollars" routine. My income is jeopardized. They force you to convert your hard-earned money to pesos, the national currency that is worth a roll of toilet paper.
I took most of my dollars from the bank account and put them in a safe location.
I pay taxes, I'm a "responsable inscripto."
I pay taxes, 35% of my annual income next year.
I made advanced payments due to my income, about 40k pesos for 5 months.
They nuked my income with this law.
I can't save anymore. For the first time in my life, I had savings, enough to live for years without working. They force me into that devalued currency they use and take most of what I earn.

In Argentina, there is no way to get paid what I earn in the States; they devalue the peso, and you don't have a way to cancel your debt with the State. Boom, welcome fiscal problems.
The utterly stupid arguments that I hear when people tell me this is not what it is are ridiculous.

The only 100% legal option is buying dollars, which they will forbid soon. The previous government, the authoritarian party of Cristina Fernandez de Kirchner (CFK for short), did the same. They "offered dollars," but you had to ask AFIP (the IRS, if you are from the States) for permission to buy, and most of the time you couldn't. Politicians could get their shitty hands on dollars without problems, though.

The other option is to put money into an FCI (a mutual fund), but this week I read news about people who invested in one, got scammed, and lost their money.

Doing things properly and following the law in Argentina is stupid, dangerous, and counterproductive.

Paying taxes is counterproductive, the system is made to bankrupt you, no matter what you do.
I'm not a huge company, so I don't have the resources to defend myself against the State. I don't have a viable future in this country. Being a remote developer works when we've got free access to dollars. *We bring dollars into the country*, we don't move the dollars out.

In August I left Thirstie; there was nothing left to learn or gain there.
I started working for Vero; we'll see what happens. At the moment I've got a nice contract, but the economic situation here is awful.

Future?
Well, Alberto Fernandez seems to be the one that is going to win, and he comes back with Cristina Fernandez de Kirchner.
So this means rampant corruption, authoritarianism. 
Now they come back with a payback.
Welcome back to the mindset that the peso is good and the dollar is bad.
Forbid the entrance of external goods, block all imports, so you have the idiots here that buy the cheapest, shittiest Chinese material, slap a sticker that says "made in Argentina" and voila, the national industry is back!. Hey, that's important, because it creates jobs. I'm sure that after the protectionism is lifted, we will have the most advanced solutions for everything like it always has happened in this country. It's impossible to think that they will use the subsidies to get a metric shit-ton of money and become rich. No sir. They will invest in the development of my country. Yes, no concerns at all.
Macri was a total and absolute disaster. I've never seen someone fuck up so badly, crash, burn and take us down with him, well, except that he won't have problems.

Thursday, September 21, 2017

Fintech

More experiences

And this is a new entry on this forgotten blogger site I've created and barely update.
What I'm up to... the microservices fiasco.
Long story short, the microservices project was just a charade because they didn't have any work to offer me after the contract was lost.
That project grew a bit until they ran out of funds.
Then they put me in whatever project they could find for me, and that was for a company that installed solar rooftops: a 4-page Flask application with a lot of heavy trigonometry calculations, full of callbacks, and with 0 documentation.

They had tests for everything, but unluckily for me, I was stuck with a specific case of an AutoCAD file that produced an error when it went through the flow.

The code was complicated and hard to read, and I'm no good at trigonometry or electrical things, so my ability to provide reasonable solutions was 0. That is why I decided to leave.

Now 

I received an offer to work with a fin-tech company. I was excited; finance was a sector I had always wanted to get involved with, ever since I read Test-Driven Development: By Example back in 2009.
And let me tell you. I wish I had never fulfilled my dreams.
Reality is awful.

The only place for my TDD book in this company would be to put it under a monitor to lift up the monitor.
The company is a startup, they honestly don't give a damn at all for any kind of testing.
No unit tests.
No TDD.
No integration testing.
No pep8.
I overheard that they use threading. I asked, "Don't you have problems with the GIL?"
They answered me, "What is the GIL?"
That is not the answer I was expecting, and this is not an elaborate joke.
And no, I'm not using hyperbole; they literally answered me that.

I don't like this new job.
This client is focused on the UI and on creating Facebook pages,
and literally doesn't care if one of his customers hits an error while purchasing stock options.
What happens if the transaction never occurs, or it does occur and you get an error?

The company hires US university students and doesn't pay them.
They lure them with one of those fucking internships where they get "recommendations" for their services.
This girl that they hired did excellent work, and my piece-of-shit coworkers laughed at her.

Saturday, August 13, 2016

Micro services

Intro

And we are back.

After 10 years, I'm still working in software development.


What I'm working on.

Well, after what I believe to be a failed contract, my company decided that I should be researching microservices, AWS (ECS, EC2, SES, SQS, SNS), Elastic (soon to be replaced by mongo).

What is each acronym that I'm using there?

AWS: Amazon Web Services. Yeah, I'm still on AWS. I had been using it at my previous employer (a company in Argentina that had a contract with someone from the States), where I had some sort of devops / developer position.

ECS: This is an Amazon service that allows you to run Docker containers on Amazon EC2 instances.

Before giving you my explanation: without reading this article by Yevgeniy Brikman, I could never have set up my Docker tasks in ECS.

Give that link a try first; it is the best explanation I found, way better than the Amazon documentation.

The main problem that I see with the Amazon guide is that there is no ordered tutorial: you have the documentation, but it doesn't follow a proper order.
The article by Yevgeniy Brikman is excellent.

The first thing you will need if you are going to use this is an EC2 instance built from an ECS-optimized AMI, for example (amzn-ami-2016.03.d-amazon-ecs-optimized).
Amazon covers the containers in the documentation, but it is not the first thing they cover.
They start the documentation with creating private Docker registries, which you will only need way later on.

SES: Is used to send emails using the SES amazon service. SES stands for Simple Email Service.
SQS: Amazon Simple Queue Service. A simple queue, self-explanatory
SNS: Amazon Simple Notification Service. This is used to deliver notifications.

Together, SQS + SNS work like RabbitMQ. Amazon does all the heavy lifting for you, though it has some peculiarities. Due to high replication, there is a chance that, depending on how you configure your SQS queue, one or more workers may receive the same message published by SNS. They state in the documentation that your workers must be able to cope properly with a message that was already processed. You can also control how long a message stays invisible to other consumers once it has been received, with a timeout that you can configure.
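
A sketch of what "able to cope with an already processed message" can look like, assuming every message carries a stable identifier. The class name is made up, and the in-memory set stands in for whatever shared storage the workers would really use.

```python
class IdempotentConsumer:
    """Runs the handler at most once per message id, ignoring duplicates."""

    def __init__(self, handler):
        self._handler = handler
        self._processed_ids = set()   # assumption: a shared store in real life

    def consume(self, message_id, body):
        """Return True if the message was handled, False if it was a duplicate."""
        if message_id in self._processed_ids:
            return False
        self._handler(body)
        # Mark as processed only after the handler succeeded, so a failed
        # message can be delivered and tried again.
        self._processed_ids.add(message_id)
        return True


handled = []
consumer = IdempotentConsumer(handled.append)
print(consumer.consume("msg-1", "welcome email"))
print(consumer.consume("msg-1", "welcome email"))   # same message delivered twice
print(handled)
```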

The other part of my highlight is going to be the SES part.
SES is like Mandrill (or MailChimp now). The main thing is that when you start the service, you are in sandbox mode. You will need to verify an email address or domain, and you will only be able to send emails to verified addresses.
You need to submit a request so that they allow you to deliver emails to anyone.
The most important part here is that they ask you how you handle bounces and complaints.
You can create an SNS topic that will receive the bounces/complaints/ deliveries, and you will hook it up with an SQS consumer.
All that, by clicking with the mouse.
Seems pretty straightforward, but they didn't put that on the page.
The main idea is that if you receive a complaint or a bounce, you just take it out of circulation (i.e., don't try to deliver emails to that recipient).
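
A sketch of the consumer side of that idea, assuming the SQS message body is the SNS envelope (raw message delivery turned off) wrapping the SES notification. The field names follow the SES notification format as I understand it; the function name is mine.

```python
import json


def addresses_to_suppress(sqs_body):
    """Extract the recipients that should be taken out of circulation."""
    envelope = json.loads(sqs_body)
    # Through an SNS -> SQS subscription, the SES notification arrives as a
    # JSON string inside the envelope's "Message" field.
    if "Message" in envelope:
        notification = json.loads(envelope["Message"])
    else:
        notification = envelope
    kind = notification.get("notificationType")
    if kind == "Bounce":
        recipients = notification["bounce"]["bouncedRecipients"]
    elif kind == "Complaint":
        recipients = notification["complaint"]["complainedRecipients"]
    else:
        return []                      # deliveries need no action
    return [recipient["emailAddress"] for recipient in recipients]


bounce = {"notificationType": "Bounce",
          "bounce": {"bouncedRecipients": [{"emailAddress": "gone@example.com"}]}}
body = json.dumps({"Type": "Notification", "Message": json.dumps(bounce)})
print(addresses_to_suppress(body))
```

What the application does with the returned addresses (a blocklist table, a flag on the user) is up to each service.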

With which language?

Python 3.5.1 (yeah, I decided to use something newer).
I'm using uWSGI / Flask, with the following extensions (used across some of the microservices):

  • flask-jwt
  • alembic
  • flask-sqlalchemy
  • greenlet
  • boto3
  • flasgger
  • swagger-ui
Flask-JWT: An extension to create JWT tokens, which in my case will be consumed by a JS client.
The extension has some gotchas; you need to read the code if you need to change things.
With some monkey-patching and inheritance, you can alter the functionality of JWT completely, but you need to spend some time reading.

flasgger: I love flasgger. I was using flask-swagger, but being honest, the docstrings under the function names make the whole thing illegible. Flasgger lets you point to a file. If you don't know what this is, it is used to write documentation for Swagger UI.

Swagger-ui: The only complex thing that I needed to do here was to add support for the JWT token. Since some endpoints are protected by JWT tokens, I didn't have a way to push the token there.
I found the way.

Authentication.

I was able to handle this by using JWT. When you are developing microservices, and you have to manage authentication, there are multiple things to take into account.
Disclaimer: this is not a definitive guide. At the moment I'm experimenting. So far this is my best solution, but I believe that there are other solutions out there.
If you split a user service and a permission service, moving that token will become tricky, and it's difficult to think sometimes.
The idea is that you pass around that token until it expires.
No token, no service. Simple.
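
To make "no token, no service" concrete, a sketch of what every service has to do with the token it receives: check the signature and the expiry. This is a hand-rolled HS256 example using only the standard library, meant to show the mechanics; it is not what Flask-JWT does internally, and a maintained JWT library should be used in a real service. The function names and the shared secret are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input, secret):
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims, secret, ttl_seconds=300, now=None):
    """Build header.payload.signature with an `exp` claim added."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = dict(claims, exp=now + ttl_seconds)
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token, secret, now=None):
    """Return the claims if the token is valid and unexpired, otherwise None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(header_b64 + "." + payload_b64, secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError):
        return None                    # malformed token: no service
    if claims.get("exp", 0) <= now:
        return None                    # expired token: no service
    return claims


secret = b"shared-between-services"    # placeholder secret
token = issue_token({"sub": "user-42"}, secret, ttl_seconds=60)
print(verify_token(token, secret))
print(verify_token(token, b"another-secret"))
```

Every service that shares the secret can run `verify_token` on its own, which is what lets the token be passed around without calling the user service on each request.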
Obtaining the permissions of the entity after logging is a must, and I think that a caching strategy will be needed there.
A simple example, I log in, I obtain my permissions and put some key, like user_id: permissions in a Redis storage.
On the next request, I will ask Redis for that user's permissions.
When for some reason I decide to revoke a permission, I will delete the Redis record, and well, they will need to fetch the permissions again.
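
A sketch of that caching strategy, assuming a redis-py style client (`get`, `set` with `ex`, `delete`). The key format, the TTL, and the `load_permissions` callback (standing in for the call to the permission service) are placeholders of mine.

```python
import json

PERMISSIONS_TTL_SECONDS = 300   # placeholder expiry


def _cache_key(user_id):
    return "permissions:{}".format(user_id)


def get_permissions(cache, user_id, load_permissions):
    """Return the cached permissions, fetching and caching them on a miss."""
    cached = cache.get(_cache_key(user_id))
    if cached is not None:
        return json.loads(cached)
    permissions = load_permissions(user_id)      # e.g. ask the permission service
    cache.set(_cache_key(user_id), json.dumps(permissions), ex=PERMISSIONS_TTL_SECONDS)
    return permissions


def revoke_permissions(cache, user_id):
    """Drop the cached entry so the next request has to fetch fresh permissions."""
    cache.delete(_cache_key(user_id))


# Usage, assuming a local Redis and the redis-py package:
#   cache = redis.Redis()
#   get_permissions(cache, 42, lambda user_id: ["orders:read"])
#   revoke_permissions(cache, 42)
```

The expiry is a safety net: even if a revocation misses the delete, stale permissions disappear after the TTL.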

This is a work in progress, which I may extend much further later on.