
AWS S3 LS: Displaying Filenames Without Date Information


When working with Amazon Web Services (AWS) Simple Storage Service (S3), you might find yourself in need of listing objects within an S3 bucket without displaying additional information such as dates, file sizes, or times. 

This blog post will explore different methods of achieving this using the AWS Command Line Interface (CLI) and other command-line tools to filter the output and display only the file names.

Listing Files in an S3 Bucket

The basic command for listing files within an S3 bucket is the aws s3 ls command. By default, this command will display a lot of information for each object, including the date, time, and file size. 

If you only want to display the file names, you can achieve this by using command-line tools like awk or jq to manipulate the output.

Using AWK to Display Only File Names

To display only the names of the files (without any additional information), you need to extract the fourth column from the output of the aws s3 ls command. 

You can achieve this using the Unix command-line tool awk. The awk command processes text files and is particularly useful for text manipulation.

```bash
aws s3 ls s3://mybucket --recursive | awk '{print $4}'
```

However, this command won’t work correctly if your file names contain spaces because awk separates columns based on whitespace. 

To handle file names with spaces, you can use a more complex awk command that removes the first three columns from each line and then removes any leading whitespace characters.

```bash
aws s3 ls s3://mybucket --recursive | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//'
```

This command is slightly more complex, but it can handle file names with spaces correctly.
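To see how this behaves, you can run the same awk/sed pipeline against a few sample lines of aws s3 ls output (the keys below are made up for illustration):

```shell
# Two hypothetical lines of `aws s3 ls --recursive` output.
listing='2023-01-15 10:30:45       1024 photos/my vacation.jpg
2023-02-20 08:12:03       2048 docs/report.pdf'

# Blank out the date, time, and size columns, then strip the
# leading whitespace that awk leaves behind.
names=$(printf '%s\n' "$listing" | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//')
printf '%s\n' "$names"
```

One caveat: because awk rebuilds each line with single spaces between fields, a key containing runs of consecutive spaces would still come out mangled.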

Using JQ to Display Only File Names

Another way to display only the files from the aws s3 ls command is to use the s3api command with jq. jq is a command-line JSON processor that is particularly useful for extracting data from JSON files.

The aws s3api list-objects command lists all objects (files and folders) in the specified S3 bucket. By default, it displays a JSON object that contains information about each object.

To extract only the filenames from the JSON output, you can pipe the full response to jq and pull out each object's Key field with the .Contents[].Key expression. Alternatively, as in the command below, the --query option lets the AWS CLI pre-filter the response, leaving jq to simply unwrap the resulting array:

```bash
aws s3api list-objects --bucket mybucket --query "Contents[].Key" | jq -r '.[]'
```

Getting the Last Modified File

In some cases, you might want to find the last modified file in an S3 bucket. You can do this using a combination of commands, including sort, tail, and awk. 

Here’s an example command that will download the last modified file from the S3 bucket to the specified local path:

```bash
aws s3 ls s3://your-bucket-name --recursive | sort | tail -n 1 | awk '{print $4}' | xargs -I {} aws s3 cp s3://your-bucket-name/{} /local/path/to/save
```

Replace your-bucket-name and /local/path/to/save with the appropriate values for your use case. As before, awk '{print $4}' will truncate keys that contain spaces. Note also that this method may not be efficient for buckets with a large number of objects because it requires listing and sorting all objects in the bucket.
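The sorting step works because each listing line starts with an ISO-style YYYY-MM-DD date, which sorts correctly as plain text. A quick sketch with made-up data:

```shell
# Hypothetical listing: ISO dates sort lexicographically, so a
# plain `sort` puts the most recently modified object last.
listing='2023-01-15 10:30:45       1024 backup-jan.tar.gz
2023-03-02 22:05:10       4096 backup-mar.tar.gz
2023-02-20 08:12:03       2048 backup-feb.tar.gz'

newest=$(printf '%s\n' "$listing" | sort | tail -n 1 | awk '{print $4}')
echo "$newest"
```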

In that case, you may want to consider alternative approaches, such as using S3 inventory or setting up a Lambda function to trigger on object creation.

Displaying Directories in an S3 Bucket

When working with directories in an S3 bucket, you might want to display only the directory names. Since S3 treats directories as prefixes, you can use a combination of commands to achieve this.

One solution is to cut away the fixed-width date, time, and size columns at the start of each line, pass each remaining path to dirname to extract its directory, and then use uniq to avoid repeats (note that the character offset used with cut depends on the column widths in your listing):

```bash
aws s3 ls --recursive <s3Uri> | cut -c32- | xargs -d '\n' -n 1 dirname | uniq
```
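The dirname-and-uniq step can be sketched on its own with a few invented keys (a read loop stands in here for xargs -d, which is a GNU extension):

```shell
# Hypothetical object keys, already stripped of the listing columns.
keys='photos/2023/img1.jpg
photos/2023/img2.jpg
docs/report.pdf'

# dirname reduces each key to its directory; uniq collapses the
# adjacent duplicates (the listing is already sorted by key).
dirs=$(printf '%s\n' "$keys" | while IFS= read -r key; do dirname "$key"; done | uniq)
printf '%s\n' "$dirs"
```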

Another solution is to use the aws s3api list-objects-v2 command with the --query option to check for keys ending with a '/' character, which indicates a zero-byte directory placeholder (such keys exist only when a folder was created explicitly, for example through the S3 console):

```bash
aws s3api list-objects-v2 --bucket <your_bucket> --prefix <prefix> --query "Contents[?ends_with(Key, '/')].[Key]" --output text
```

Replace <your_bucket> and <prefix> with the appropriate values for your use case.

Listing Files in a Date Range

If you want to list all the files uploaded to a virtual folder in S3 within a specific date range, you can use the awk command to filter the results based on the date column after fetching all the records with the aws s3 ls command.

```bash
aws s3 ls s3://your-bucket-name/path/to/virtual/folder/ --recursive | awk '$1 >= "YYYY-MM-DD" && $1 <= "YYYY-MM-DD" {print}'
```

Replace your-bucket-name, path/to/virtual/folder/, and the date range with the appropriate values for your use case.
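With concrete dates filled in, the filter behaves like this (the listing lines below are made up for illustration):

```shell
listing='2023-01-15 10:30:45       1024 logs/jan.log
2023-02-20 08:12:03       2048 logs/feb.log
2023-03-02 22:05:10       4096 logs/mar.log'

# Keep only the lines whose date column falls inside February 2023.
in_range=$(printf '%s\n' "$listing" | awk '$1 >= "2023-02-01" && $1 <= "2023-02-28" {print}')
echo "$in_range"
```

Because the comparison is a plain string comparison on the first column, it works only with ISO-ordered YYYY-MM-DD dates.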

Final Thoughts

In this blog post, we explored different methods to display only the file names when using the aws s3 ls command. We covered using awk and jq to manipulate the output, finding the last modified file, displaying directories in an S3 bucket, and listing files in a specific date range.

By combining the AWS CLI with powerful command-line tools, you can create custom commands tailored to your specific needs when working with S3 buckets. Whether you need to display just the file names, directories, or filter results based on certain criteria, these methods can help you achieve your desired output.
