megfile.s3_path module
- class megfile.s3_path.S3Path(path: str | BasePath | PathLike, *other_paths: str | BasePath | PathLike)[source]
Bases: URIPath
- absolute() S3Path [source]
Make the path absolute, without normalization or resolving symlinks. Returns a new path object
- access(mode: Access = Access.READ, followlinks: bool = False) bool [source]
Test if path has access permission described by mode
- Parameters:
mode – access mode
- Returns:
True if the bucket of s3_url grants the access described by mode, else False
- copy(dst_url: str | BasePath | PathLike, callback: Callable[[int], None] | None = None, followlinks: bool = False, overwrite: bool = True) None [source]
Copy a file on s3: copy the content of the file at this path (src_url) to dst_url. It is the caller’s responsibility to ensure that s3_isfile(src_url) is True
- Parameters:
dst_url – Target file path
callback – Called periodically during the copy; its argument is the number of bytes copied since the previous call
followlinks – If False, treat a symlink as a regular file; if True, follow it
overwrite – Whether to overwrite the target file if it already exists, default is True
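The callback contract described above (each call receives the bytes copied since the previous call, not a running total) can be sketched as follows. This is only an illustration, not megfile's implementation: `FakeCopyPath` is a hypothetical stand-in that mimics the documented `copy()` signature so the snippet runs without S3 access, and `track_copy` is a helper name invented here.

```python
from typing import Callable, Optional


class FakeCopyPath:
    """Hypothetical stand-in for S3Path; 'copies' an in-memory payload in chunks."""

    def __init__(self, payload: bytes, chunk_size: int = 4):
        self.payload = payload
        self.chunk_size = chunk_size

    def copy(self, dst_url: str, callback: Optional[Callable[[int], None]] = None,
             followlinks: bool = False, overwrite: bool = True) -> None:
        # Report the number of bytes handled since the previous callback call.
        for start in range(0, len(self.payload), self.chunk_size):
            chunk = self.payload[start:start + self.chunk_size]
            if callback is not None:
                callback(len(chunk))


def track_copy(src, dst_url: str) -> int:
    """Copy src to dst_url and return the total number of bytes reported."""
    total = 0

    def on_progress(nbytes: int) -> None:
        nonlocal total
        total += nbytes  # each call carries an increment, so accumulate it

    src.copy(dst_url, callback=on_progress)
    return total


copied = track_copy(FakeCopyPath(b"0123456789"), "s3://bucket/target")
print(copied)
```

Because `track_copy` only relies on the documented `copy(dst_url, callback=...)` call, it should work unchanged with a real S3Path, assuming the callback behaves as described.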
- exists(followlinks: bool = False) bool [source]
Test if s3_url exists
If reading the bucket of s3_url is not permitted, return False
- Returns:
True if s3_url exists, else False
- getmtime(follow_symlinks: bool = False) float [source]
Get last-modified time of the file on the given s3_url path (in Unix timestamp format).
If the path is an existent directory, return the latest modified time of all files in it. The mtime of an empty directory is 1970-01-01 00:00:00
If s3_url is not an existent path, which means s3_exists(s3_url) returns False, then raise S3FileNotFoundError
- Returns:
Last-modified time
- Raises:
S3FileNotFoundError, UnsupportedError
- getsize(follow_symlinks: bool = False) int [source]
Get file size on the given s3_url path (in bytes).
If the path is a directory, return the total size of all files in it, including files in subdirectories (if any).
The result excludes the size of the directory itself. In other words, it is 0 bytes for an empty directory.
If s3_url is not an existent path, which means s3_exists(s3_url) returns False, then raise S3FileNotFoundError
- Returns:
File size
- Raises:
S3FileNotFoundError, UnsupportedError
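The directory rule above (sum every file under the path, subdirectories included) can be modelled on a flat key listing. The sketch below is a stand-alone model under stated assumptions, not megfile code: `prefix_size` and the `objects` mapping of key to size are hypothetical, and FileNotFoundError stands in for S3FileNotFoundError.

```python
from typing import Dict


def prefix_size(objects: Dict[str, int], path: str) -> int:
    """Return the size of a file, or the total size of all files under a directory."""
    if path in objects:
        return objects[path]  # the path names a single file
    root = path.rstrip("/") + "/"
    sizes = [size for key, size in objects.items() if key.startswith(root)]
    if not sizes:
        # Stand-in for S3FileNotFoundError; this model cannot represent
        # an existing but empty directory.
        raise FileNotFoundError(path)
    return sum(sizes)


objects = {"data/a.txt": 3, "data/sub/b.txt": 5, "other/c.txt": 7}
print(prefix_size(objects, "data"))
```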
- glob(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) List[S3Path] [source]
Return a list of s3 paths that match the glob pattern, in ascending alphabetical order
Note: globbing works only within a bucket. Trying to match the bucket name with wildcard characters raises UnsupportedError
- Parameters:
pattern – Glob the given relative pattern in the directory represented by this path
recursive – If False, ** will not search directory recursively
missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError
- Raises:
UnsupportedError, when bucket part contains wildcard characters
- Returns:
A list of paths that match s3_pathname
- glob_stat(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) Iterator[FileEntry] [source]
Return a generator of tuples of path and file stat for the paths that match the glob pattern, in ascending alphabetical order
Note: globbing works only within a bucket. Trying to match the bucket name with wildcard characters raises UnsupportedError
- Parameters:
pattern – Glob the given relative pattern in the directory represented by this path
recursive – If False, ** will not search directory recursively
missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError
- Raises:
UnsupportedError, when bucket part contains wildcard characters
- Returns:
A generator of tuples of path and file stat, for the paths that match s3_pathname
- hasbucket() bool [source]
Test if the bucket of s3_url exists
- Returns:
True if bucket of s3_url exists, else False
- iglob(pattern, recursive: bool = True, missing_ok: bool = True, followlinks: bool = False) Iterator[S3Path] [source]
Return an iterator over the s3 paths that match the glob pattern, in ascending alphabetical order
Note: globbing works only within a bucket. Trying to match the bucket name with wildcard characters raises UnsupportedError
- Parameters:
pattern – Glob the given relative pattern in the directory represented by this path
recursive – If False, ** will not search directory recursively
missing_ok – If False and target path doesn’t match any file, raise FileNotFoundError
- Raises:
UnsupportedError, when bucket part contains wildcard characters
- Returns:
An iterator over the paths that match s3_pathname
- is_dir(followlinks: bool = False) bool [source]
Test if an s3 url is a directory. The url is treated as a directory in either of the following cases:
There exists a suffix for which os.path.join(s3_url, suffix) is a file
The url is an empty bucket or s3://
- Parameters:
followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories
- Returns:
True if path is s3 directory, else False
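The first condition above says that a path counts as a directory when at least one object key lies beneath it. A minimal model of that rule, assuming a flat list of keys within one bucket (`looks_like_dir` is a hypothetical helper, and the empty-bucket and s3:// cases are not covered):

```python
from typing import Iterable


def looks_like_dir(keys: Iterable[str], path: str) -> bool:
    """Return True if some key lies under path + '/', i.e. path acts as a directory."""
    root = path.rstrip("/") + "/"
    return any(key.startswith(root) for key in keys)


keys = ["logs/2024/app.log", "readme.txt"]
print(looks_like_dir(keys, "logs"))        # a key exists below logs/
print(looks_like_dir(keys, "readme.txt"))  # a plain file has nothing below it
```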
- is_file(followlinks: bool = False) bool [source]
Test if an s3_url is a file
- Returns:
True if path is s3 file, else False
- is_symlink() bool [source]
Test whether a path is a link
- Returns:
True if the path is a link, else False
- Raises:
S3NotALinkError
- iterdir(followlinks: bool = False) Iterator[S3Path] [source]
Get all contents of given s3_url. The result is in ascending alphabetical order.
- Returns:
All contents that have s3_url as a prefix, in ascending alphabetical order
- Raises:
S3FileNotFoundError, S3NotADirectoryError
- listdir(followlinks: bool = False) List[str] [source]
Get all contents of given s3_url. The result is in ascending alphabetical order.
- Returns:
All contents that have s3_url as a prefix, in ascending alphabetical order
- Raises:
S3FileNotFoundError, S3NotADirectoryError
- load(followlinks: bool = False) BinaryIO [source]
Read the entire content at the specified path in binary mode and load it into memory
The caller must close the returned BinaryIO manually
- Returns:
BinaryIO
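Since the returned stream must be closed by the caller, one way to make that hard to forget is `contextlib.closing`. The sketch below assumes only the documented `load()` call; `FakeLoadPath` is a hypothetical stand-in backed by `io.BytesIO` so the example runs without S3, and `read_all` is a helper name invented here.

```python
import io
from contextlib import closing
from typing import BinaryIO, Optional


class FakeLoadPath:
    """Hypothetical stand-in for S3Path; load() returns an in-memory stream."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.last_stream: Optional[BinaryIO] = None

    def load(self, followlinks: bool = False) -> BinaryIO:
        self.last_stream = io.BytesIO(self.payload)
        return self.last_stream


def read_all(path) -> bytes:
    """Read everything from path.load() and close the stream afterwards."""
    with closing(path.load()) as stream:
        return stream.read()


path = FakeLoadPath(b"hello")
data = read_all(path)
print(data)
```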
- md5(recalculate: bool = False, followlinks: bool = False) str [source]
Get the md5 meta info of a file that was uploaded/copied via megfile
If the meta info is lost or non-existent, return None
- Parameters:
recalculate – If True, calculate the md5 in real time; if False, return the s3 etag
followlinks – If True, calculate the md5 of the real file that the symlink points to
- Returns:
md5 meta info
- mkdir(mode=511, parents: bool = False, exist_ok: bool = False)[source]
Create an s3 directory. A directory cannot actually be created on its own, because OSS has no such concept; this function only tests whether the target bucket has WRITE access.
- Parameters:
mode – Ignored; accepted only for compatibility with pathlib.Path
parents – Ignored; accepted only for compatibility with pathlib.Path
exist_ok – If False and the target directory exists, raise S3FileExistsError
- Raises:
S3BucketNotFoundError, S3FileExistsError
- move(dst_url: str | BasePath | PathLike, overwrite: bool = True) None [source]
Move file/directory path from src_url to dst_url
- Parameters:
dst_url – Given destination path
overwrite – Whether to overwrite the destination file if it already exists
- open(mode: str = 'r', *, encoding: str | None = None, errors: str | None = None, s3_open_func: Callable = s3_buffered_open, **kwargs) IO [source]
Open the file with mode.
- property parts: Tuple[str, ...]
A tuple giving access to the path’s various components
- property path_with_protocol: str
Return path with protocol, like file:///root, s3://bucket/key
- property path_without_protocol: str
Return path without protocol, example: if path is s3://bucket/key, return bucket/key
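The relation between the two properties can be illustrated with plain string handling. This is a sketch of the documented behaviour only, not megfile's parsing code; `split_protocol` and `split_bucket_key` are hypothetical helpers.

```python
from typing import Optional, Tuple


def split_protocol(path: str) -> Tuple[Optional[str], str]:
    """Split 's3://bucket/key' into ('s3', 'bucket/key')."""
    protocol, sep, rest = path.partition("://")
    if not sep:
        return None, path  # no protocol present
    return protocol, rest


def split_bucket_key(path_without_protocol: str) -> Tuple[str, str]:
    """Split 'bucket/key' into ('bucket', 'key')."""
    bucket, _, key = path_without_protocol.partition("/")
    return bucket, key


protocol, rest = split_protocol("s3://bucket/dir/key")
print(protocol, rest, split_bucket_key(rest))
```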
- protocol = 's3'
- readlink() S3Path [source]
Return an S3Path instance representing the path to which the symbolic link points
- Returns:
An S3Path instance representing the path to which the symbolic link points
- Raises:
S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError, S3NotALinkError
- remove(missing_ok: bool = False) None [source]
Remove the file or directory on s3. Removing s3:// or s3://bucket is not permitted
- Parameters:
missing_ok – If False and the target file/directory does not exist, raise S3FileNotFoundError
- Raises:
S3PermissionError, S3FileNotFoundError, UnsupportedError
- rename(dst_path: str | BasePath | PathLike, overwrite: bool = True) S3Path [source]
Move s3 file path from src_url to dst_url
- Parameters:
dst_path – Given destination path
overwrite – Whether to overwrite the destination file if it already exists
- save(file_object: BinaryIO)[source]
Write the opened binary stream to the specified path. The stream is not closed afterwards
- Parameters:
file_object – Stream to be read
- scan(missing_ok: bool = True, followlinks: bool = False) Iterator[str] [source]
Iteratively traverse only files in given s3 directory, in alphabetical order. Every iteration on generator yields a path string.
If s3_url is a file path, yields the file only
If s3_url is a non-existent path, return an empty generator
If s3_url is a bucket path, return all file paths in the bucket
If s3_url is an empty bucket, return an empty generator
If s3_url doesn’t contain any bucket, i.e. s3_url == ‘s3://’, raise UnsupportedError. scan() on the whole of s3 is not supported in megfile
- Parameters:
missing_ok – If False and there’s no file in the directory, raise FileNotFoundError
- Raises:
UnsupportedError
- Returns:
A file path generator
- scan_stat(missing_ok: bool = True, followlinks: bool = False) Iterator[FileEntry] [source]
Iteratively traverse only files in given directory, in alphabetical order. Every iteration on generator yields a tuple of path string and file stat
- Parameters:
missing_ok – If False and there’s no file in the directory, raise FileNotFoundError
- Raises:
UnsupportedError
- Returns:
A generator of file entries (path string and file stat)
- scandir(followlinks: bool = False) Iterator[FileEntry] [source]
Get all contents of given s3_url, the order of result is not guaranteed.
- Returns:
All contents that have s3_url as a prefix
- Raises:
S3FileNotFoundError, S3NotADirectoryError
- stat(follow_symlinks=True) StatResult [source]
Get the StatResult of the s3_url file, including file size and mtime (see s3_getsize and s3_getmtime)
If s3_url is not an existent path, which means s3_exists(s3_url) returns False, then raise S3FileNotFoundError
If attempting to get the StatResult of the whole of s3, such as s3_dir_url == ‘s3://’, raise S3BucketNotFoundError
- Returns:
StatResult
- Raises:
S3FileNotFoundError, S3BucketNotFoundError
- symlink(dst_path: str | BasePath | PathLike) None [source]
Create a symbolic link pointing to src_path named dst_path.
- Parameters:
dst_path – Destination path
- Raises:
S3NameTooLongError, S3BucketNotFoundError, S3IsADirectoryError
- sync(dst_url: str | BasePath | PathLike, followlinks: bool = False, force: bool = False, overwrite: bool = True) None [source]
Copy file/directory on src_url to dst_url
- Parameters:
dst_url – Given destination path
followlinks – If False, treat a symlink as a regular file; if True, follow it
force – Force the sync: files that are already identical are copied too rather than skipped. Takes priority over ‘overwrite’, default is False
overwrite – Whether to overwrite the destination file if it already exists, default is True
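One plausible reading of how force and overwrite interact for a single file is sketched below. It is an assumption drawn from the parameter descriptions, not a copy of megfile's logic: `should_copy` is a hypothetical helper, and `same_file` abstracts over however the library decides that source and destination are identical.

```python
def should_copy(dst_exists: bool, same_file: bool,
                force: bool = False, overwrite: bool = True) -> bool:
    """Decide whether one file would be copied during a sync (assumed semantics)."""
    if force:
        return True       # force wins, even over overwrite=False
    if not dst_exists:
        return True       # nothing to overwrite or compare against
    if not overwrite:
        return False      # keep the existing destination file
    return not same_file  # skip files that are already identical


print(should_copy(dst_exists=True, same_file=True))              # identical: skipped
print(should_copy(dst_exists=True, same_file=True, force=True))  # forced: copied
```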
- unlink(missing_ok: bool = False) None [source]
Remove the file on s3
- Parameters:
missing_ok – If False and the target file does not exist, raise S3FileNotFoundError
- Raises:
S3PermissionError, S3FileNotFoundError, S3IsADirectoryError
- walk(followlinks: bool = False) Iterator[Tuple[str, List[str], List[str]]] [source]
Iteratively traverse the given s3 directory in top-down order: the parent directory is traversed first, then its subdirectories (if any) in alphabetical order.
Every iteration on generator yields a 3-tuple: (root, dirs, files)
root: Current s3 path;
dirs: Name list of subdirectories in current directory. The list is sorted by name in ascending alphabetical order;
files: Name list of files in current directory. The list is sorted by name in ascending alphabetical order;
If s3_url is a file path, return an empty generator
If s3_url is a non-existent path, return an empty generator
If s3_url is a bucket path, bucket will be the top directory, and will be returned at first iteration of generator
If s3_url is an empty bucket, yield only one 3-tuple (note: s3 has no empty directories)
If s3_url doesn’t contain any bucket, which is s3_url == ‘s3://’, raise UnsupportedError. walk() on complete s3 is not supported in megfile
- Parameters:
followlinks – The result is the same whether followlinks is True or False, because s3 symlinks do not support directories
- Raises:
UnsupportedError
- Returns:
A 3-tuple generator
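The traversal order and the shape of the yielded 3-tuples can be modelled on a flat list of object keys. The sketch below is a stand-alone approximation, not megfile's implementation: `walk_keys` is a hypothetical function, keys are written without the s3:// protocol, and the empty-bucket and s3:// cases are not modelled.

```python
from typing import Iterator, List, Set, Tuple


def walk_keys(keys: List[str], root: str) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (root, dirs, files) top-down from a flat list of object keys."""
    prefix = root.rstrip("/") + "/"
    dirs: Set[str] = set()
    files: List[str] = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        head, sep, _ = key[len(prefix):].partition("/")
        if not head:
            continue            # ignore a key that is just the prefix itself
        if sep:
            dirs.add(head)      # the key lives in a subdirectory
        else:
            files.append(head)  # the key is a direct child
    if not dirs and not files:
        return                  # file path or non-existent path: empty generator
    yield root.rstrip("/"), sorted(dirs), sorted(files)
    for name in sorted(dirs):
        yield from walk_keys(keys, prefix + name)


keys = ["bucket/a/x.txt", "bucket/a/b/y.txt", "bucket/z.txt"]
for entry in walk_keys(keys, "bucket"):
    print(entry)
```

The parent tuple is always produced before any of its subdirectories, and a file path or a non-existent path produces nothing, matching the cases listed above.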