Description

Last week a customer asked me what I think about the AWS Firewall and whether it would be wise for him to implement it in his environment.
I skimmed the various tech documents about it, got confused at first, and realised that this is not one of those topics I was going to understand quickly and move on from.

I invested more time into it, started reading about rule groups, stateless and stateful rules and … I needed a break. There was surely no quick answer in sight.

Going into the topic even deeper I reached the proverbial French saying “je ne sais quoi”.

Would I implement this firewall in my own environment ?

Maybe
Then again I’m a perfectionist, always afraid of failing…

I tend to find limitations, restrictions, imperfections and get easily disappointed.

If you want to learn I invite you to join me in this journey :)

What is AWS Firewall

If we look in the AWS docs we see:

AWS Network Firewall is a stateful, managed, network firewall and intrusion detection and prevention service for your virtual private cloud (VPC) that you created in Amazon Virtual Private Cloud (Amazon VPC). With Network Firewall, you can filter traffic at the perimeter of your VPC.

Sounds pretty compelling, no ?

When something is too good to be true… double check.

Basic concepts

Components:

  • 1 x FW
    • deploys a leg (like the AWS PrivateLink ~ Endpoint Service) in an AZ inside a VPC
    • and another leg in another AZ for redundancy
  • 1 x FW Policy that is attached 1:1 to a FW
    • the Policy groups together the Rule Groups that you apply to the FW
    • also the AWS Managed Lists
  • AWS Managed Lists
    • malicious FQDN lists
    • threat rule groups (emerging threats, botnet actions, malware, exploits)
  • Rule Groups
    • Stateless (always first processed)
    • Stateful
      • Standard FW rules
      • FQDN
      • IDS/IPS

Logic and Flow:

SUM[Rule Groups (Stateful RG + Stateless RG) ] + AWS Managed Lists --> 1 x FW Policy --> 1..n x FW

Worth mentioning that AWS FW uses Suricata for both FQDN and IDS/IPS. In IDS/IPS rules it supports the drop keyword.

When a packet arrives at the FW:

  • It first goes to the Stateless Rule Groups.
  • If there is a match and a clear action -> processing stops.
    • except if the action == FWD TO STATEFUL
  • If there is NO match, but a default action -> continues from there (pass/drop/FWD TO STATEFUL).
  • It reaches the Stateful Engine
    • again goes through the rules until a match
    • goes as per Priority Order if FW Policy configured in STRICT MODE for it
    • groups first PASS, then DROP, then REJECT, then ALERT if FW Policy configured in DEFAULT MODE
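
To make the first half of that flow concrete, here is a minimal Python sketch of the stateless stage and the hand-off to the stateful engine. It is a toy model, not how AWS implements it: the `StatelessRule` class, the action labels and the packet dictionary are all names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Illustrative action labels (not the AWS API strings).
PASS, DROP, FORWARD = "pass", "drop", "forward_to_stateful"


@dataclass
class StatelessRule:
    priority: int                      # lower number = evaluated first
    action: str                        # PASS, DROP or FORWARD
    matches: Callable[[dict], bool]    # predicate over a toy packet dict


def stateless_verdict(packet: dict, rules: List[StatelessRule], default_action: str) -> str:
    """Return the action of the first matching rule, else the policy default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(packet):
            return rule.action
    return default_action


def process(packet: dict, rules: List[StatelessRule], default_action: str,
            stateful_engine: Callable[[dict], str]) -> str:
    """Stateless first; only FORWARD hands the packet to the stateful engine."""
    verdict = stateless_verdict(packet, rules, default_action)
    if verdict == FORWARD:
        return stateful_engine(packet)
    return verdict


rules = [
    StatelessRule(10, DROP, lambda p: p["dst_port"] == 23),
    StatelessRule(20, FORWARD, lambda p: p["dst_port"] == 443),
]

# Stand-in for the stateful engine: here it simply passes everything.
result = process({"dst_port": 443}, rules, PASS, lambda p: "stateful:pass")
print(result)
```

A packet to port 23 is dropped by the stateless stage and never reaches the stateful engine; a packet to port 80 matches nothing and takes the default action.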

Rule Groups

Stateless

A rule group can only contain EITHER stateful OR stateless rules.

You can find the basic configuration for a stateless rule group here:

What stands out is that you can:

  • predefine a capacity (~ TCAM) for the Rule Group; this is setting a max resource cap on it
  • Actions are:
    • Pass = Accept the packet, stop processing
    • Drop = Drop the packet
    • Forward to stateful Rule Group (next in line for processing it)
      • here you have the more fancy stuff
      • IDS/IPS
      • FQDN

At Firewall Policy (groups together Rule Groups) level there’s a default action to set for packets that did NOT match a Stateless RG.
There it can also be:

  • Pass
  • Drop
  • Forward to stateful rule groups
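
As a rough sketch of where that default action lives, the snippet below builds a firewall policy document as a plain Python dictionary. The key names and the `aws:pass` / `aws:drop` / `aws:forward_to_sfe` strings follow the Network Firewall API's FirewallPolicy structure as I understand it, so double check them against the API reference. The `build_policy` helper and the ARNs are hypothetical placeholders.

```python
STATELESS_ACTIONS = {"aws:pass", "aws:drop", "aws:forward_to_sfe"}


def build_policy(stateless_arns, stateful_arns,
                 default_action="aws:forward_to_sfe", strict=True):
    """Build a FirewallPolicy-shaped dict (hypothetical helper, no AWS call made)."""
    if default_action not in STATELESS_ACTIONS:
        raise ValueError(f"unknown stateless default action: {default_action}")

    stateful_refs = []
    for i, arn in enumerate(stateful_arns):
        ref = {"ResourceArn": arn}
        if strict:
            # Assumption: priorities on stateful references are only used in strict order.
            ref["Priority"] = (i + 1) * 10
        stateful_refs.append(ref)

    return {
        "StatelessRuleGroupReferences": [
            {"ResourceArn": arn, "Priority": (i + 1) * 10}
            for i, arn in enumerate(stateless_arns)
        ],
        # What happens to packets that matched no stateless rule.
        "StatelessDefaultActions": [default_action],
        "StatelessFragmentDefaultActions": [default_action],
        "StatefulEngineOptions": {
            "RuleOrder": "STRICT_ORDER" if strict else "DEFAULT_ACTION_ORDER"
        },
        "StatefulRuleGroupReferences": stateful_refs,
    }


# Placeholder ARNs, not real resources.
policy = build_policy(
    ["arn:aws:network-firewall:eu-west-1:111122223333:stateless-rulegroup/example-sl"],
    ["arn:aws:network-firewall:eu-west-1:111122223333:stateful-rulegroup/example-sf"],
)
print(policy["StatelessDefaultActions"])
```

A dictionary like this is what you would hand to the SDK or to your IaC tool of choice. I kept the sketch free of AWS calls so it runs anywhere.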

In Custom Actions you can configure CloudWatch logging.
This only gives you statistics, though.
Full logging only works for Stateful rules.

Stateful

This is where the strength of the AWS Firewall lies.
When processing Rule Groups, the FW first sends a packet through the Stateless RGs
and then depending on the action
(default of FW Policy if no match | stateless RG if matched) = FWD to Stateful RG
the processing will continue here.

There are 3 types of Stateful RG:

  • Standard (allow/deny)
  • FQDN (implemented via Suricata)
  • IDS/IPS (also via Suricata)

Stateful Rule order

This is probably the most confusing topic, or at least it was for me when I struggled to read through the AWS docs.

Reminder: 1 x FW policy = SUM(Rule Groups)

What it does:

  • Strict Mode
    • Rule Groups and Rules inside are processed as per Priority defined
  • Default Mode
    • the FW puts everything together and then
    • First process the Pass Rules
    • Then Drop
    • Then Reject
    • Then Alert (Logging)

It seems you can OVERRIDE the Default Mode set at FW Policy level inside a Rule Group. The rules inside that RG will then be processed as per priority.

Only applies to Stateful Rules.
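
The difference between the two modes is easier to see with a tiny Python sketch. It only models the ordering (priority in Strict Mode, action type in Default Mode); the rule names are invented, and any finer tie-breaking between rules with the same action is left out on purpose.

```python
from typing import List, Tuple

# Default Mode groups rules by action in this order.
ACTION_RANK = {"pass": 0, "drop": 1, "reject": 2, "alert": 3}

Rule = Tuple[int, str, str]  # (priority, action, name); names are made up


def evaluation_order(rules: List[Rule], strict: bool) -> List[str]:
    """Return rule names in the order the stateful engine would look at them."""
    if strict:
        ordered = sorted(rules, key=lambda r: r[0])               # by priority
    else:
        ordered = sorted(rules, key=lambda r: ACTION_RANK[r[1]])  # by action type
    return [name for _, _, name in ordered]


rules = [
    (1, "alert", "log-all-tls"),
    (2, "drop", "block-bad-domain"),
    (3, "pass", "allow-updates"),
]

print(evaluation_order(rules, strict=True))
# ['log-all-tls', 'block-bad-domain', 'allow-updates']
print(evaluation_order(rules, strict=False))
# ['allow-updates', 'block-bad-domain', 'log-all-tls']
```

Same three rules, two different outcomes: in Default Mode the pass rule is looked at first even though I gave it the lowest priority.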

Stateful RG level:

FW policy level:

Standard

You define a rule group:

  • set a capacity (~TCAM) that it can consume
  • add your rules (TCP/UDP/ICMP)
  • and an action:
    • Pass
    • Drop
    • Alert (logging)

Rule definition

How it looks afterward:

Variables

Variables are quite cool to use in Rule Groups as they make your life easier when grouping together IPs/VMs for filtering purposes.

Here you have 2 choices:

Rule Variables: they behave like Linux environment variables.
You define a VAR_NAME and associate it with a list of CIDRs.

IPSET references: here there are 2 subtypes.

  • Managed Prefix Lists (Virtual Private Cloud -> Managed Prefix Lists menu to define them)
  • Network Firewall Resource Groups (Network Firewall -> Network Firewall Resource Groups menu)

The latter I find pretty cool.
It allows you to create a Dynamic Prefix List based on either EC2/VM attributes (TAGs) or EC2 ENIs.
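
Here is a small Python sketch of how the two kinds of variables might be expressed in a rule group definition. The `RuleVariables` / `ReferenceSets` key names and the `$NAME` / `@NAME` syntax inside Suricata rules reflect my reading of the AWS docs, so treat them as assumptions. The variable names, CIDRs and the ARN are placeholders.

```python
import ipaddress


def rule_variables(ip_sets: dict) -> dict:
    """Build a RuleVariables-shaped dict, rejecting malformed CIDRs early."""
    for cidrs in ip_sets.values():
        for cidr in cidrs:
            ipaddress.ip_network(cidr)  # raises ValueError if the CIDR is invalid
    return {"IPSets": {name: {"Definition": list(cidrs)} for name, cidrs in ip_sets.items()}}


def reference_sets(refs: dict) -> dict:
    """Build a ReferenceSets-shaped dict pointing at prefix lists / resource groups."""
    return {"IPSetReferences": {name: {"ReferenceArn": arn} for name, arn in refs.items()}}


variables = rule_variables({"WEB_SERVERS": ["10.0.1.0/24", "10.0.2.0/24"]})
references = reference_sets({
    # Placeholder ARN of a managed prefix list.
    "TAGGED_HOSTS": "arn:aws:ec2:eu-west-1:111122223333:prefix-list/pl-0123456789abcdef0",
})

# Assumed usage inside Suricata rules: $ for rule variables, @ for IP set references.
rules = "\n".join([
    'pass tcp $WEB_SERVERS any -> any 443 (msg:"web out"; sid:1000001; rev:1;)',
    'drop ip @TAGGED_HOSTS any -> any any (msg:"quarantine"; sid:1000002; rev:1;)',
])
print(variables["IPSets"]["WEB_SERVERS"]["Definition"])
```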

FQDN

This one is pretty much self-explanatory:

  • list of domains to match
  • allow/deny
  • whether only for specific sources or for ALL traffic
    • can define CIDRs
    • no support for variables/ip sets like above
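
To illustrate the matching logic, here is a minimal Python sketch of a domain list check. It assumes the convention I remember from the AWS docs: a plain name matches only itself, while a name with a leading dot matches the domain and all its subdomains. The function names and the example domains are mine.

```python
def domain_matches(hostname: str, target: str) -> bool:
    """Exact match, or wildcard match when the target starts with a dot."""
    hostname = hostname.lower().rstrip(".")
    target = target.lower()
    if target.startswith("."):
        # ".example.com" covers example.com and every subdomain of it.
        return hostname == target[1:] or hostname.endswith(target)
    return hostname == target


def fqdn_verdict(hostname: str, targets: list, list_type: str) -> str:
    """Allow list: pass only on a hit. Deny list: drop only on a hit."""
    hit = any(domain_matches(hostname, t) for t in targets)
    if list_type == "ALLOWLIST":
        return "pass" if hit else "drop"
    if list_type == "DENYLIST":
        return "drop" if hit else "pass"
    raise ValueError(f"unknown list type: {list_type}")


targets = [".example.com", "updates.example.org"]
print(fqdn_verdict("api.example.com", targets, "ALLOWLIST"))    # pass
print(fqdn_verdict("badexample.com", targets, "ALLOWLIST"))     # drop
```

The hostname being checked is whatever the FW sees in the HTTP Host header or the TLS SNI, which is why encrypted SNI is a problem for this feature.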

IDS/IPS

Here you define your rules in Suricata format with an action:

  • pass
  • drop
  • alert

You can find examples of rules that are publicly available here:
Proofpoint Emerging Threat Rules

FAQ

How do I get traffic to arrive to my FW ?

You need to add a route in your subnet's route table that points the destination (E-W or Egress 0/0) to the FW VPC Endpoint.

Are the Firewall Resources unlimited ?

No. There is a Capacity concept that reminds me of TCAM on switches.
If you add an AWS managed Rule Group to your FW Policy you will see a Capacity field and how much of it is being consumed.
If you add a custom Rule Group you can also set a limit on how much Capacity it can consume for sum[Rules(stateless|stateful)].
The total is around 30k per Firewall.
1 x FW rule != 1 x Capacity unit.
It depends on how the AWS algorithm manages to group rules together when rendering them on the FW (this means you can have several rules consuming just 1 x capacity unit, for example).
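
For stateless rules, my understanding of the AWS docs is that the capacity of a single rule is the product of the number of values in each match setting, where an empty or "any" setting counts as 1. The sketch below works through that arithmetic; the function name and the sample rule are made up, and it says nothing about how stateful rules are counted.

```python
from math import prod


def stateless_rule_capacity(match_settings: dict) -> int:
    """Multiply the number of values per match setting (empty setting counts as 1)."""
    return prod(max(1, len(values)) for values in match_settings.values())


# Hypothetical rule: 2 protocols x 3 sources x 5 destinations x 1 destination port.
rule = {
    "protocols": [6, 17],
    "sources": ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"],
    "source_ports": [],            # not specified, counts as 1
    "destinations": ["192.0.2.0/28", "192.0.2.16/28", "192.0.2.32/28",
                     "192.0.2.48/28", "192.0.2.64/28"],
    "destination_ports": ["443"],
}

print(stateless_rule_capacity(rule))  # 2 * 3 * 1 * 5 * 1 = 30
```

So a single stateless rule with a few lists in it can cost 30 units, which is another reason why counting rules tells you little about how much Capacity you have left.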

How it works - Life of a packet

Use case:
Workloads that want to reach the Internet:

  • Workload sends the packet to the VPC Router
  • There you program a route 0/0 via the AWS FW VPC Endpoint
  • The Endpoint sends the traffic to the FW
  • The FW does the processing
  • Packet arrives back on the VPC Endpoint Subnet and Route Table
  • There it finds a 0/0 via IGW, goes out

What about those green little thingies with 2A, 2B, 2C ?

  • When a packet hits the FW it first goes through the Stateless Rule Groups (SUM[Stateless FW rules inside the Rule Group])
  • depending on the decision it is Accepted/Dropped or goes to Stateful
  • if going to Stateful then -> does it match SRC/DST IP/Port from TLS Inspection ?
    • if yes, then do TLS man in the middle, then FWD to Stateful
    • if not, FWD to Stateful
  • Stateful Match
    • Standard Rules
    • FQDN
    • IDS/IPS

NAT

NAT what ?

Sadly, the AWS FW cannot do any sort of NAT.
This means you still need to add an IGW or a NAT GW to the mix, but on the positive side AWS has waived the pricing for the latter
(as long as you pay the security license for the FW).

Packet Flow Workload-Internet:

  • Workload sends smth toward Internet
  • It hits Protected Subnet RT -> finds 0/0 via VPC EP of AWS FW
  • AWS FW receives the traffic, takes a decision
  • Packet returns to VPC EP on Firewall Subnet RT -> route lookup
  • finds 0/0 via IGW
  • reaches IGW, does SNAT transparently (to 54.0.0.10), goes out

Packet Flow return traffic:

  • Internet destination replies
  • IGW receives it (dst 54.0.0.10), reverses NAT (~DNAT), now has DST = Workload = 10.0.0.10
  • looks up Ingress RT finds 10.0.0.0/24 via VPC EP of AWS FW
  • AWS FW gets the packet, allow/deny, if allow
  • Packet returns to VPC EP in Firewall Subnet Route Table
  • Finds 10.0.0.0/16 via local -> sends to Workload inside the VPC
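
Both flows boil down to a series of longest-prefix-match lookups in three route tables. The Python sketch below replays them with the addresses used above; the table names, the target labels and the Internet address 198.51.100.7 are placeholders of mine, and NAT at the IGW is not modelled.

```python
import ipaddress


def lookup(route_table: dict, dst_ip: str) -> str:
    """Return the target of the most specific route that contains dst_ip."""
    dst = ipaddress.ip_address(dst_ip)
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dst in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        raise LookupError(f"no route to {dst_ip}")
    return max(candidates, key=lambda c: c[0].prefixlen)[1]


# Route tables as described in the two packet flows.
PROTECTED_SUBNET_RT = {"10.0.0.0/16": "local", "0.0.0.0/0": "fw-endpoint"}
FIREWALL_SUBNET_RT = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw"}
IGW_INGRESS_RT = {"10.0.0.0/16": "local", "10.0.0.0/24": "fw-endpoint"}

# Workload -> Internet
egress_path = [
    lookup(PROTECTED_SUBNET_RT, "198.51.100.7"),  # sent to the FW endpoint
    lookup(FIREWALL_SUBNET_RT, "198.51.100.7"),   # after inspection, out via IGW
]

# Internet -> Workload (after the IGW reversed the NAT to 10.0.0.10)
return_path = [
    lookup(IGW_INGRESS_RT, "10.0.0.10"),          # /24 beats /16, back to the FW
    lookup(FIREWALL_SUBNET_RT, "10.0.0.10"),      # after inspection, delivered locally
]

print(egress_path, return_path)
```

The interesting bit is the Ingress RT: without the more specific 10.0.0.0/24 route, return traffic would follow the local route and skip the FW entirely.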

FQDN/TLS

Here I find the AWS Website information confusing.
I wonder if it’s on purpose.

So TLS is supported….

  • Ingress from Internet to Workloads
  • E-W

NOT EGRESS like Azure FW, Aviatrix, Fortinet/Checkpoint/Palo

Weird, right ?

Then for pricing:

It does not get better… First of all, support is just in Sydney and Ireland.
Then for pricing, sometimes I pay an extra Advanced Inspection/GB fee on top and sometimes not.

Moving on, a few implementation details and caveats:

  • FQDN is done by Suricata after TLS decryption
  • FQDN seems to look at HTTP header and SNI
  • TLS inspection is NEW, since March 2023 -> forgive its beta character
  • For it to work you cannot import your CA or use an AWS stored CA
  • You need individual certificates !!!!
  • You can use max 10 !!!!
  • Have more than 10 services that require TLS inspection on Ingress from Internet or E-W -> BAD LUCK
  • You can have 20 TLS inspection configs, but then again max 10 certs (“cool”)

The setup is pretty easy.
Associate a TLS decrypt certificate with your new TLS Inspection Config:

You give it a name:
Define for which traffic to apply it (SRC/DST Address/Port)
Remember that this will match after the traffic has left the Stateless Rule Groups (sum[rules])
Review and confirm:

Other things to keep in mind:

  • Fail-Close behavior (non-TLS traffic hitting TLS Config = DROP)
  • the technology will probably mature in the future as people start using it and AWS starts bringing it on par with other industry FWs

Pricing

I don’t normally put Pricing in-between, but I found that Advanced License / GB in some regions super confusing, so I felt the need to follow up.

Oh wait, it did not get better…

Now I have:

  • Regions with pricing 0 for Advanced Inspection Traffic Processing (as per the official note) -> why some so, some not ?
  • Regions that are not supposed to have AWS FW TLS Inspection have pricing for it -> living in the future ?
  • NAT GW prices are clearly waived as said before

IDS/IPS

Here you can define your own rules (good examples here: Proofpoint Emerging Threat Rules)

Or import the AWS managed ones for both FQDN and IDS/IPS:

In total they are around 20-22k, so that still leaves enough capacity until the 30k max for your own custom rule groups (sum[rules]).

Define your own like this inside a RuleGroup of type Suricata compatible rule string:
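
A minimal sketch of what such a rule string might look like, kept in a Python string together with a tiny header parser so you can sanity check it before pasting. The domain, message and sid are placeholders, and the regex only understands the basic "action proto src port -> dst port (options)" shape, not the full Suricata grammar.

```python
import re

# Placeholder rule: drop TLS sessions whose SNI ends with blocked.example.
RULE = (
    'drop tls $HOME_NET any -> $EXTERNAL_NET any '
    '(msg:"Block TLS to blocked.example"; tls.sni; content:"blocked.example"; '
    'nocase; endswith; flow:to_server, established; sid:1000001; rev:1;)'
)

HEADER = re.compile(
    r'^(?P<action>pass|drop|reject|alert)\s+(?P<proto>\S+)\s+'
    r'(?P<src>\S+)\s+(?P<src_port>\S+)\s+(?P<direction>->|<>)\s+'
    r'(?P<dst>\S+)\s+(?P<dst_port>\S+)\s+\((?P<options>.*)\)$'
)


def parse_rule(rule: str) -> dict:
    """Split a rule into its header fields and raw options string."""
    match = HEADER.match(rule.strip())
    if match is None:
        raise ValueError("not a recognisable Suricata-style rule")
    return match.groupdict()


parsed = parse_rule(RULE)
print(parsed["action"], parsed["proto"], parsed["src"], "->", parsed["dst"])
```

Every rule needs its own unique sid inside the rule group, so pick a numbering range for your custom rules and stick to it.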

Logging

Only for STATEFUL RULES.
Stateless rules can only optionally send statistics to CloudWatch = no. of packets accepted/dropped/etc.

What you can log for statefully processed packets:

  • what matched and headers of the packets (if action is DROP/ALERT/REJECT)
  • flow logs

Example of an ALERT log:

Logging destinations:

  • S3 -> can query it with Athena or build Dashboards with QuickSight
  • CloudWatch
  • Kinesis Firehose -> can stream to 3rd parties

Not clear how to do a query in Athena ? -> ChatGPT really works well !! I tested it myself


AWS Firehose has various integrations and can then stream the data onward to your SIEM, Splunk, or an ELK deployment.

FW Designs

Very good post from Evgeny here: AWS FW Designs

Distributed

You place an AWS FW Endpoint in each VPC that needs it.

BEWARE: Max 5 x AWS FW / Region (see Limits)

Centralized

This one uses a TGW.
You are routing your traffic between instances in different VPCs via TGW.
The TGW sends it out toward the Security/Inspection VPC.
AWS FW does its magic, then sends it back to TGW.
TGW returns it to the destination VPC (if a workload) or sends it to an Egress VPC + NAT GW (if Internet destined).

Limits

The ones that stick out:

  • Max 5 x FWs / Region (distributed scenario might not be enough)
  • 20 x RuleGroups (Stateful|Stateless)
    • each can contain multiple Rules so ~ OK
  • IP SET references just 5
  • 30000 Capacity (~TCAM like)
  • Stateless rules have no logging, only statistics regarding accepted/dropped packets
  • max 20 TLS inspection configs but even so max 10 SSL certificates for your domains
  • maintaining routes to the FWs is a cumbersome task -> up to the user

Takeaways

  • Two levels of rules (stateful/stateless)
  • Confusing design of rules + ordering (Default/Strict)
  • Grouping of resources (IPSET / ResourceGroups, nice one based on VM Tags)
  • TCAM allocation + Quotas / RuleGroup
  • Manual Routing, management overhead
  • NO SNAT/DNAT support (extra NAT GW needed)
  • TLS decryption seems more of a “beta” release
  • No QUIC, TLS v1.3 Encrypted SNI, Encrypted Client Hello (no other vendor does it either though)
  • No Geo Blocking
  • IDS can use AWS Managed Rulesets (quite a lot of rules inside)
  • Flexible Logging + SIEM integrations
    • just for Stateful Rules
  • Pricing (*no NATGW charge) but confusing information on the AWS Website