DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Low-Code Development: Leverage low and no code to streamline your workflow so that you can focus on higher priorities.

DZone Security Research: Tell us your top security strategies in 2024, influence our research, and enter for a chance to win $!

Launch your software development career: Dive head first into the SDLC and learn how to build high-quality software and teams.

Open Source Migration Practices and Patterns: Explore key traits of migrating open-source software and its impact on software development.

Related

  • Machine Learning With Python: Data Preprocessing Techniques
  • Data Analytics Using Python
  • LSTM Single Variate Implementation Approach: Forecasting
  • Files and Exceptions in Python

Trending

  • Contexts in Go: A Comprehensive Guide
  • How To Perform JSON Schema Validation in API Testing Using Rest-Assured Java
  • The Rise of Kubernetes: Reshaping the Future of Application Development
  • Setting up Device Cloud: A Beginner’s Guide
  1. DZone
  2. Data Engineering
  3. Data
  4. Implementing RSA in Python From Scratch

Implementing RSA in Python From Scratch

In this article, explore an easy-to-understand explanation of the mathematical background behind RSA cryptography algorithms.

By 
Traven West user avatar
Traven West
·
Apr. 04, 24 · Analysis
Like (1)
Save
Tweet
Share
1.3K Views

Join the DZone community and get the full member experience.

Join For Free

Please note that it is essential for me to emphasize that the code and techniques presented here are intended solely for educational purposes and should never be employed in real-world applications without careful consideration and expert guidance.

At the same time, understanding the principles of RSA cryptography and exploring various implementations is valuable for educational purposes, and understanding how to code encryption methods, and building secure encryption systems for practical use requires adherence to industry standards, best practices, and thorough security assessments. An inadequate implementation or improper usage of cryptographic techniques can lead to severe vulnerabilities, jeopardizing the confidentiality and integrity of sensitive data.

I'm sure many programmers, particularly web developers have heard of the RSA cryptography system. RSA is an asymmetric cryptography system, meaning that one key is used for encryption and the other for decryption. I've seen a lot of articles explaining the general principles of asymmetric cryptography, but I have not seen any that give easy-to-understand explanations of the mathematical background behind these algorithms, so I decided to write this article.

Explanation of the Math Behind RSA

To start, as I've mentioned in the paragraph above, to transmit an RSA-encrypted message, you need 2 keys. One that can only encrypt and one that can decrypt. Let the encryption key be denoted as e, the decryption key will be denoted as d and the message will be denoted as m. The key principle behind RSA is that in the following notion:

(m^e)^d ≡ m (mod n)

The difficulty of finding d, knowing e and n can be very hard if these numbers are chosen properly (as will be demonstrated below).

But first, we need to introduce some new concepts in number theory.

  • a ≡ b (mod n) means that there is an integer x such that a + nx = b. A more intuitive explanation is that the reminder of a / n equals the reminder of b / n.
  • gcd(a, b) is the greatest number that divides both a and b.
  • lcm(a, b) is the smallest number that is a multiple of both a and b.
  • λ(n) is a number such that x^λ(n) ≡ 1 (mod n) for any x such that gcd(x, n) = 1. From this you can conclude that x^a ≡ x^b (mod n) if a ≡ b (mod λ(n))

Generating Keys

Let's make n = pq where p and q are 2 prime numbers. Since λ(p) = p - 1 and λ(q) = q - 1 (lookup Fermat's little theorem proof for why), λ(n) = (p - 1) * (q - 1).

Now we have to solve ed ≡ 1 (mod λ(n)). e can be some prime number (usually 65537) and d can be calculated using extended Euclidian's algorithm (will be written and explained in the code section) from the equation ed + xλ(n) = gcd(e, λ(n)) (d and x are coefficients to be calculated).

pair (e, n) is used for encryption ((m^e) % n) and pair (d, n) is used for decryption ((m^d) % n)

After computing the keys, p and q can be thrown away. Notice that without p and q, finding λ(n) would mean factorizing n, which is not an easy problem to solve for values of n up to 2^2048, which are regularly used.

Implementing RSA in Python

First list procedures and their steps:

Keys Generation

  • Find 2 random prime numbers, p and q
  • Compute n = p * q and λ(n) = (p - 1) * (q - 1)
  • Make e equal some prime number, e.g. e = 35537
  • Compute d from equation ed + λ(n)x = gcd(e, λ(n)) = 1 using Extended Euclidian Algorithm (from this point on we will call it eucalg)

Encryption

  • Divide the message into sections (of 256 bits if n is 2048-bit)
  • Each section (denoted as m) is encrypted as (m^e) % n

Decryption

  • Divide the message into sections, same as above
  • Each section (denoted as m) is decrypted as (m^d) % n

Implementation of Extended Euclidian Algorithm

This algorithm relies on the fact that if x divides both a and b, there will be a pair of coefficients (k, l) such that:
a * k + b * l = x
The algorithm finds these coefficients (and x) by repeatedly substracting the lesser argument from the bigger one, until the lesser one becomes 0. Instead of repeatedly substracting, you can calculate how many times b can be substracted from a (k = a // b) and then substratct b*k. This optimization makes the algorithm run in O(log max(a, b)) which is a lot quicker.
Below is an implementation of the algorithm in Python:

Python
 
def eucalg(a, b):
     # make a the bigger one and b the lesser one
     swapped = False
     if a < b:
         a, b = b, a
         swapped = True
     # ca and cb store current a and b in form of
     # coefficients with initial a and b
     # a' = ca[0] * a + ca[1] * b
     # b' = cb[0] * a + cb[1] * b
     ca = (1, 0)     cb = (0, 1)
     while b != 0:
         # k denotes how many times number b
         # can be substracted from a
         k = a // b
         # a  <- b
         # b  <- a - b * k
         # ca <- cb
         # cb <- (ca[0] - k * cb[0], ca[1] - k * cb[1])
         a, b, ca, cb = b, a-b*k, cb, (ca[0]-k*cb[0], ca[1]-k*cb[1])
     if swapped:         
return (ca[1], ca[0])
     else:
         return ca


Notice that the function above can produce negative numbers for coefficients, so in case that happens, all that we need to do is set d to λ(n) - d.

Implementing Fast Exponentiation With Modulo

Some might suggest just using (b ** e) % n, but this approach is not very good time and memory-wise because b ** e can be very big despite only the last 2000 bits or so being needed. The solution is to reimplement exponentiation by calculating modulo after every division.

The exponentiation implementation below has logarithmic time complexity. Instead of iterating from 1 to e and multiplying r (result) by b, it iterates from the most significant bit of e to the least significant bit of e, and at each bit it does r = r*r + bit. This works because if r equals b^x and you're appending a bit to the end of x, new x will be x * 2 + bit.

Python
 
def modpow(b, e, n):
     # find length of e in bits
     tst = 1
     siz = 0
     while e >= tst:
         tst <<= 1
         siz += 1
     siz -= 1
     # calculate the result
     r = 1
     for i in range(siz, -1, -1):
         r = (r * r) % n
         if (e >> i) & 1: r = (r * b) % n
     return r


Key Generation, Encryption, and Decryption

Keys generation, encryption, and decryption have all been explained in the math section and the code below is simply the implementation of that.

Python
 
def keysgen(p, q):     n = p * q     lambda_n = (p - 1) * (q - 1)     e = 35537     d = eucalg(e, lambda_n)[0]     if d < 0: d += lambda_n        # both private and public key must have n stored with them     return {'priv': (d, n), 'pub': (e, n)} 
def numencrypt(m, pub):     return modpow(m, pub[0], pub[1]) 
def numdecrypt(m, priv):     return modpow(m, priv[0], priv[1])


Testing the Code We Have Until Now

I stored the code in a file named rsa.py and ran the following in a Python shell:

Python
 
>>> import rsa >>> keys = rsa.keysgen(31337, 31357) 
>>> keys {'priv': (720926705, 982634309), 'pub': (35537, 982634309)} 
>>> priv = keys['priv'] 
>>> pub = keys['pub'] 
>>> msg = 80087 >>> enc = rsa.numencrypt(msg, priv)
>>> enc 34604568 
>>> dec = rsa.numdecrypt(enc, pub) 
>>> dec 80087 
>>> 


Conclusion

At the end of writing this article, I realized that the implementation of a usable Python RSA program is more complicated than I initially speculated. Because of that, I decided to split the topic up into multiple articles. With the code presented in this article, you have all the core parts of RSA written. But you still need a random prime generator and plaintext encryption (numencrypt and numdecrypt are suitable only for numbers m smaller in size than n). All these will be included in the next article that will be published shortly.

Algorithm Python (language) Data (computing)

Published at DZone with permission of Traven West, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Machine Learning With Python: Data Preprocessing Techniques
  • Data Analytics Using Python
  • LSTM Single Variate Implementation Approach: Forecasting
  • Files and Exceptions in Python

Partner Resources


Comments

ABOUT US

  • About DZone
  • Send feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: