Configuration
Config Files
aiomegfile reads configuration from a few standard locations:
- ~/.config/megfile/megfile.conf for generic runtime environment values and alias mappings
- ~/.aws/credentials for S3 profiles written by amf config s3
- ~/.hdfscli.cfg for HDFS profiles written by amf config hdfs
Main Runtime File
The main config file (~/.config/megfile/megfile.conf) supports two sections: [env] for runtime environment values and [alias] for alias mappings.
Example:
[env]
MEGFILE_MAX_WORKERS = 16
MEGFILE_READER_BLOCK_SIZE = 16MB
MEGFILE_WRITER_BLOCK_SIZE = 16MB
MEGFILE_WEBDAV_MAX_RETRY_TIMES = 6
WEBDAV_USERNAME = alice
[alias]
datasets = s3://company-datasets/
reports = file:///srv/reports/
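To see how the two sections are laid out, a file like the example above can be parsed with Python's standard `configparser`. This is only an illustrative sketch, not how aiomegfile itself loads the file: the `load_megfile_conf` helper is a hypothetical name, and key case is preserved explicitly because `configparser` lowercases option names by default.

```python
import configparser

EXAMPLE = """\
[env]
MEGFILE_MAX_WORKERS = 16
MEGFILE_READER_BLOCK_SIZE = 16MB
WEBDAV_USERNAME = alice

[alias]
datasets = s3://company-datasets/
reports = file:///srv/reports/
"""


def load_megfile_conf(text):
    """Hypothetical helper: return (env, alias) dicts parsed from INI text."""
    # Disable interpolation so a literal '%' in a value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep MEGFILE_* keys upper-case instead of the default lower-casing.
    parser.optionxform = str
    parser.read_string(text)
    env = dict(parser["env"]) if parser.has_section("env") else {}
    alias = dict(parser["alias"]) if parser.has_section("alias") else {}
    return env, alias


env, alias = load_megfile_conf(EXAMPLE)
print(env["MEGFILE_MAX_WORKERS"])
print(alias["datasets"])
```

All values come back as strings, so a consumer would still need to convert entries such as `MEGFILE_MAX_WORKERS` to integers.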
Common Environment Variables
| Variable | Meaning |
|---|---|
| | Global retry limit used when a protocol-specific retry limit is not set. |
| MEGFILE_MAX_WORKERS | Global concurrency limit for background async work. |
| MEGFILE_READER_BLOCK_SIZE | Chunk size for buffered readers. |
| | Maximum prefetch buffer size for readers. |
| | Enables lazy prefetch behavior when supported. |
| MEGFILE_WRITER_BLOCK_SIZE | Chunk size for buffered writers, especially important for S3 multipart uploads. |
| | Maximum in-memory write buffer size. |
| | Enables writer block auto-scaling when no explicit writer block size is set. |
| | Retry limit for S3 operations. |
| | Retry limit for HDFS operations. |
| | Retry limit for HTTP and HTTPS operations. |
| | Retry limit for SFTP operations. |
| MEGFILE_WEBDAV_MAX_RETRY_TIMES | Retry limit for WebDAV and WebDAVS operations. |
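The fallback between a protocol-specific retry limit and the global one can be sketched as follows. Only `MEGFILE_WEBDAV_MAX_RETRY_TIMES` is taken from the example above. The `MEGFILE_<PROTOCOL>_MAX_RETRY_TIMES` naming pattern, the global name `MEGFILE_MAX_RETRY_TIMES`, the default of 10, and the `retry_limit` helper are all assumptions made for illustration, not confirmed library behavior.

```python
import os


def retry_limit(protocol, environ=None, default=10):
    """Return the retry limit for `protocol`, falling back to the global one.

    Variable names other than MEGFILE_WEBDAV_MAX_RETRY_TIMES are assumed.
    """
    environ = os.environ if environ is None else environ
    specific = environ.get(f"MEGFILE_{protocol.upper()}_MAX_RETRY_TIMES")
    if specific is not None:
        return int(specific)
    # Assumed name for the global limit; `default` is also an assumption.
    return int(environ.get("MEGFILE_MAX_RETRY_TIMES", default))


sample = {"MEGFILE_MAX_RETRY_TIMES": "3", "MEGFILE_WEBDAV_MAX_RETRY_TIMES": "6"}
print(retry_limit("webdav", sample))  # 6: protocol-specific value wins
print(retry_limit("s3", sample))      # 3: falls back to the global value
```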
Protocol-specific Settings
Some backends read their own environment variables in addition to the generic settings above.
SFTP
- SFTP_PORT
- SFTP_USERNAME
- SFTP_PASSWORD
- SFTP_PRIVATE_KEY_PATH
- SFTP_PRIVATE_KEY_PASSPHRASE
- SFTP_CONNECT_TIMEOUT
- SFTP_KEEPALIVE_INTERVAL
- SFTP_MAX_UNAUTH_CONNECTIONS
- MEGFILE_SFTP_HOST_KEY_POLICY
WebDAV
- WEBDAV_USERNAME
- WEBDAV_PASSWORD
- WEBDAV_TOKEN
- WEBDAV_TOKEN_COMMAND
- WEBDAV_TIMEOUT
- WEBDAV_INSECURE
HDFS
- HDFS_USER
- HDFS_URL
- HDFS_ROOT
- HDFS_TIMEOUT
- HDFS_TOKEN
- HDFS_CONFIG_PATH
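When debugging a connection, it can help to check which of these variables are actually set. The sketch below only inspects the process environment using the names listed above. The `configured` helper and the `PROTOCOL_VARS` table are hypothetical and say nothing about how the backends read or validate the values.

```python
import os

# Variable names copied from the lists above, grouped by backend.
PROTOCOL_VARS = {
    "sftp": [
        "SFTP_PORT", "SFTP_USERNAME", "SFTP_PASSWORD",
        "SFTP_PRIVATE_KEY_PATH", "SFTP_PRIVATE_KEY_PASSPHRASE",
        "SFTP_CONNECT_TIMEOUT", "SFTP_KEEPALIVE_INTERVAL",
        "SFTP_MAX_UNAUTH_CONNECTIONS", "MEGFILE_SFTP_HOST_KEY_POLICY",
    ],
    "webdav": [
        "WEBDAV_USERNAME", "WEBDAV_PASSWORD", "WEBDAV_TOKEN",
        "WEBDAV_TOKEN_COMMAND", "WEBDAV_TIMEOUT", "WEBDAV_INSECURE",
    ],
    "hdfs": [
        "HDFS_USER", "HDFS_URL", "HDFS_ROOT",
        "HDFS_TIMEOUT", "HDFS_TOKEN", "HDFS_CONFIG_PATH",
    ],
}


def configured(protocol, environ=None):
    """Return the documented variable names for `protocol` that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return [name for name in PROTOCOL_VARS[protocol] if environ.get(name)]


# Only names are reported, never values, so secrets are not printed.
print(configured("webdav", {"WEBDAV_USERNAME": "alice", "WEBDAV_TIMEOUT": "30"}))
```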
Using CLI Helpers
The CLI can write the common config files for you:
amf config s3 <access_key> <secret_key> --profile-name default
amf config hdfs http://namenode:9870 --profile-name prod --user hdfs
amf config alias datasets s3://company-datasets/
amf config env MEGFILE_MAX_WORKERS=16
Alias Resolution
If you define:
[alias]
datasets = s3://company-datasets/
Then datasets://images/cat.jpg resolves to
s3://company-datasets/images/cat.jpg.
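The rule above amounts to replacing the `alias://` prefix with the mapped target. A minimal sketch, assuming plain prefix substitution, is shown below. The `resolve_alias` helper is hypothetical, and the source does not specify how targets without a trailing slash are handled, so this sketch does not try to normalize them.

```python
def resolve_alias(path, aliases):
    """Expand `name://rest` to `aliases[name] + rest` when `name` is a known alias."""
    scheme, sep, rest = path.partition("://")
    if not sep or scheme not in aliases:
        # Not an alias (for example a real s3:// path): leave it unchanged.
        return path
    return aliases[scheme] + rest


aliases = {"datasets": "s3://company-datasets/"}
print(resolve_alias("datasets://images/cat.jpg", aliases))
# s3://company-datasets/images/cat.jpg
```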