Chapter 4. Listing files in available Amazon S3 buckets using notebook cells

You can check the files available in buckets you have access to by listing the objects in the bucket. Because buckets use object storage instead of a typical file system, object naming works differently from normal file naming. Objects in a bucket are always known by a key, which consists of the full path in the bucket plus the name of the file itself.
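As a minimal sketch of how a key combines a path and a file name, the following splits a key at its last slash. The helper name split_key and the example key are hypothetical and used only for illustration. The "path" is part of the key string itself; the bucket has no real directories.

```python
def split_key(key):
    """Split an object key into its path portion and its file name."""
    # rpartition splits on the LAST "/" only; a key with no "/" has an empty path.
    path, _, file_name = key.rpartition("/")
    return path, file_name


# Hypothetical key, for illustration only.
path, file_name = split_key("docker/registry/v2/example.txt")
print(path)
print(file_name)
```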

Prerequisites

Procedure

  1. Create a new notebook cell and list the objects in the bucket. For example:

    bucket_name = 'std-user-bucket1'
    s3_client.list_objects_v2(Bucket=bucket_name)

    This returns a response whose Contents list describes each object in the following format:

    {'Key': 'docker/registry/v2/blobs/sha256/00/0080913dd3f10aadb34asfgsgsdgasdga072049c93606b98bec84adb259b424f/data',
    'LastModified': datetime.datetime(2021, 4, 22, 1, 26, 1, tzinfo=tzlocal()),
    'ETag': '"6e02fad2deassadfsf900a4bd7344ffe"',
    'Size': 4052,
    'StorageClass': 'STANDARD'}
  2. To make this list easier to read, print only the key of each object instead of the full response. For example:

    bucket_name = 'std-user-bucket1'
    for key in s3_client.list_objects_v2(Bucket=bucket_name)['Contents']:
        print(key['Key'])

    This returns output similar to the following:

    docker/registry/v2/blobs/sha256/00/0080913dd3f10aadb34asfgsgsdgasdga072049c93606b98bec84adb259b424f/data
  3. You can also filter your query to list only the objects whose keys begin with a specific "path" or file name. For example:

    bucket_name = 'std-user-bucket1'
    for key in s3_client.list_objects_v2(Bucket=bucket_name, Prefix='<start_of_file_path>')['Contents']:
        print(key['Key'])
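
The examples above read a single response. A single list_objects_v2 call returns at most 1,000 objects, and the response has no Contents entry when nothing matches, so indexing it directly raises a KeyError for an empty result. The following is a minimal sketch of one way to handle both cases with the client's paginator. The helper name list_keys is hypothetical, and the function assumes it is given an S3 client such as the s3_client used in the steps above.

```python
def list_keys(s3_client, bucket_name, prefix=None):
    """Return every object key in the bucket, optionally limited to a key prefix.

    list_keys is a hypothetical helper name; s3_client is assumed to be
    an S3 client object like the one used in the procedure.
    """
    # Only pass Prefix when the caller asked for one.
    params = {"Bucket": bucket_name}
    if prefix:
        params["Prefix"] = prefix

    # The paginator issues follow-up requests until all results are read.
    paginator = s3_client.get_paginator("list_objects_v2")

    keys = []
    for page in paginator.paginate(**params):
        # A page has no 'Contents' entry when no objects match.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys
```

For example, calling list_keys(s3_client, 'std-user-bucket1') in a notebook cell returns a list of key strings that you can print or filter further.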