E1l1dh
January 24, 2023, 1:30pm
1
I am looking to fine-tune GPT-J using Amazon SageMaker from my local environment. I have been following the tutorials and documentation at https://huggingface.co/docs/sagemaker/getting-started and https://huggingface.co/docs/sagemaker/inference#deploy-with-model_data . I have my own training dataset stored in S3, but I am running into errors due to IAM role permissions. There is very little documentation covering which permissions are actually required to train a Hugging Face model using SageMaker.
If anyone knows what IAM role permissions are required to train a Hugging Face model, that would be great!
crajah
January 26, 2023, 11:29am
2
You can find the full list of IAM permissions required by each SageMaker API at this link => Amazon SageMaker API Permissions: Actions, Permissions, and Resources Reference - Amazon SageMaker
The one most relevant to your use case is the CreateTrainingJob API (CreateTrainingJob - Amazon SageMaker), which requires the following permissions:
sagemaker:CreateTrainingJob
iam:PassRole
kms:CreateGrant (required only if the associated ResourceConfig has a specified VolumeKmsKeyId and the associated role does not have a policy that permits this action)
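As a rough illustration of how these permissions might fit together, the sketch below builds two policy documents as plain Python dicts: an identity policy for whoever calls CreateTrainingJob, and the trust policy that lets SageMaker assume the execution role being passed. The account ID, role name, and function names are placeholders made up for this example, and the `Resource` scoping is just one reasonable choice, not something the docs prescribe.

```python
import json

# Hypothetical placeholders: replace with your own account ID and role name.
ACCOUNT_ID = "111122223333"
EXECUTION_ROLE_NAME = "my-sagemaker-execution-role"


def caller_policy(account_id: str, role_name: str) -> dict:
    """Identity policy for the user or role that submits the training job."""
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the caller start training jobs.
                "Effect": "Allow",
                "Action": ["sagemaker:CreateTrainingJob"],
                "Resource": "*",
            },
            {
                # Lets the caller hand the execution role to SageMaker only.
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": role_arn,
                "Condition": {
                    "StringEquals": {
                        "iam:PassedToService": "sagemaker.amazonaws.com"
                    }
                },
            },
        ],
    }


def execution_role_trust_policy() -> dict:
    """Trust policy that allows SageMaker to assume the execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(caller_policy(ACCOUNT_ID, EXECUTION_ROLE_NAME), indent=2))
    print(json.dumps(execution_role_trust_policy(), indent=2))
```

The printed JSON can be pasted into the IAM console or passed to whatever tooling you use to manage roles. kms:CreateGrant is left out here because it only matters when a VolumeKmsKeyId is set.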
To allow the training job to access data in the S3 bucket, attaching the following policy to the execution role you pass to the job should work:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR-BUCKET",
        "arn:aws:s3:::YOUR-BUCKET/*"
      ]
    }
  ]
}
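If you would rather generate this policy than hand-edit the bucket name, something like the following might help. It is a minimal sketch: `s3_read_policy` and `attach_to_role` are names invented for this example, the default policy name is arbitrary, and the attach step assumes boto3 is installed and that your local credentials are allowed to call `iam:PutRolePolicy`.

```python
import json


def s3_read_policy(bucket: str) -> dict:
    """Build the read-only S3 policy shown above for the given bucket name."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:ListBucket",
                ],
                # Bucket-level actions match the bucket ARN,
                # object-level actions match the "/*" ARN.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


def attach_to_role(role_name: str, bucket: str,
                   policy_name: str = "TrainingDataRead") -> None:
    """Attach the policy inline to an existing IAM role (assumes boto3)."""
    import boto3  # imported lazily so the builder works without boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(s3_read_policy(bucket)),
    )


if __name__ == "__main__":
    print(json.dumps(s3_read_policy("YOUR-BUCKET"), indent=2))
```

One thing to keep in mind: this policy is read-only. If the training job writes its model artifacts to the same bucket, the role would likely also need s3:PutObject on the output location.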
To restrict access further, you can attach a bucket policy (a resource-based policy on the bucket itself) that names the exact principal allowed in. Note that the principal has to be an IAM identity, such as the execution role the training job assumes; a training job ARN is not a valid principal. For example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::account-id:role/YOUR-EXECUTION-ROLE"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR-BUCKET",
        "arn:aws:s3:::YOUR-BUCKET/*"
      ]
    }
  ]
}
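A bucket policy along these lines could also be generated and applied from Python. The sketch below assumes the principal is the ARN of the IAM execution role used by the training job, since S3 bucket policies accept IAM principals such as roles. The function names are made up for this example, and the apply step assumes boto3 is installed and that your credentials permit `s3:PutBucketPolicy`.

```python
import json


def bucket_policy_for_role(bucket: str, role_arn: str) -> dict:
    """Build a bucket policy granting read access to a single IAM role."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: role_arn is the SageMaker execution role.
                "Principal": {"AWS": role_arn},
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:ListBucket",
                ],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


def apply_bucket_policy(bucket: str, role_arn: str) -> None:
    """Replace the bucket's current policy with the one built above."""
    import boto3  # imported lazily so the builder works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_policy(
        Bucket=bucket,
        Policy=json.dumps(bucket_policy_for_role(bucket, role_arn)),
    )


if __name__ == "__main__":
    # Hypothetical account ID and role name, for illustration only.
    arn = "arn:aws:iam::111122223333:role/my-sagemaker-execution-role"
    print(json.dumps(bucket_policy_for_role("YOUR-BUCKET", arn), indent=2))
```

Be aware that `put_bucket_policy` overwrites any policy already on the bucket, so merge statements first if one exists. Also, an Allow statement grants access to the named role; it does not by itself block other principals in the same account that already have access through their own IAM policies.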