diff --git a/README.md b/README.md
index 9422f99..f02f806 100644
--- a/README.md
+++ b/README.md
@@ -1,41 +1,67 @@
# noneroll
-This Google Apps Script moves your low priority (promotional, social, political, etc.) emails in Gmail to a custom label, and sends you a synopsis of recieved emails every morning for review.
+This Google Apps Script moves your low priority (promotional, social, political, etc.) emails in Gmail to a custom label and sends you a daily synopsis of received emails for review. Basically, this is `unroll.me` without any third-party access or [creepy data brokering](https://archive.is/30hj9).
-Basically, this is `unroll.me` without any 3rd party access or [creepy data brokering](https://www.nytimes.com/2017/04/24/technology/personal-data-firm-slice-unroll-me-backlash-uber.html).
+## New Features & Improvements
+
+- **Global Constants:**
+ Configure the script easily by setting global constants for your spreadsheet ID, sheet name, Gmail label, and email digest time range.
+
+- **Performance Enhancements:**
+ The script now uses JavaScript Sets and batch updates (instead of multiple `appendRow` calls) to handle larger datasets more efficiently. My personal instance of this script filters messages from almost 1300 email addresses.
+
+- **Automatic Deduplication:**
+ A new `dedupeEmails` function is included to automatically remove duplicate email addresses from your spreadsheet. You can schedule this to run periodically.
+
+- **Improved Unsubscribe Link Extraction & HTML Template:**
+ The unsubscribe link extraction has been refined, and the HTML template now uses centralized CSS.
+
+- **Modern Code Practices:**
+ The script has been updated with ES6 syntax (`const`, `let`, `for...of` loops) for improved clarity and maintainability.
## Configuration
-1. Create a new spreadsheet in Google Sheets. Name it whatever you'd like.
+1. **Create a Google Sheet:**
+ Create a new spreadsheet in Google Sheets. This spreadsheet will be used to keep track of low priority email addresses.
- - This spreadsheet will be used to keep track of low priority email addresses.
+2. **Create a Gmail Label:**
+ Create a new Gmail label (e.g., `zbulk`).
+ If you use a different label, update the `ZBULK_LABEL` constant in the script accordingly.
-2. Create a new Gmail label. Mine is called `zbulk`, but you are more than welcome to be creative.
+3. **Set Up the Google Apps Script Project:**
+ - Create a new Google Apps Script project.
+ - Copy the updated code into the project's `Code.gs` file.
+ - Replace the placeholder `SHEET_ID` (currently set to `'sheetID'`) with the actual ID of the spreadsheet (the string between `spreadsheets/d/` and `/edit` in your sheet's URL).
+ - Update `SHEET_NAME` if your target sheet name differs from the default ("Sheet1").
- - Should you decide on a different label, replace every instance of `zbulk` in the `noneroll.gs` file to whatever name you have choosen.
+4. **Configure the HTML Template:**
+ - Include the updated HTML file named `email.html` in your project. This file controls the appearance of your daily digest.
-3. Create a new Google Apps Script project.
+5. **Set Up Triggers:**
+ Configure time-driven triggers for:
+ - **`arch`:** Every 15 minutes (or as desired)
+ Archives, labels, and marks low priority emails as read.
+ - **`addEmail`:** Daily (e.g., between midnight and 1am)
+ Batches and appends new email addresses to your spreadsheet.
+ - **`noneroll`:** Daily (e.g., between 5 and 6am)
+ Sends a digest email summarizing emails from the past 24 hours.
+ - **`dedupeEmails`:** Weekly (every Sunday, between 11pm and midnight)
+ Automatically deduplicates the email list in your spreadsheet.
- - Copy code from `noneroll.gs` to the project's `Code.gs` file.
- - Replace `sheetID` with the ID of the sheet (everything after `spreadsheets/d/` and before `/edit#` in the sheet's URL) that you created in Step 1.
+## Usage & Maintenance
-4. Create project triggers. I am currently using the following settings:
+- **Tagging Emails:**
+ Tag any emails you wish to include in your low priority digest with your chosen Gmail label (e.g., `zbulk`).
- - `arch` every 15 mins.
- - Archives and labels email.
- - `addEmail` between midnight and 1am.
- - Adds labeled email addresses to spreadsheet.
- - `noneroll` between 5 and 6am.
- - Sends digest of emails from past 24 hours.
+- **Spreadsheet Data:**
+ Column A of your spreadsheet will be populated with the email addresses of low priority senders.
+ The script’s `addEmail` function should prevent duplicate entries, but the `dedupeEmails` function will periodically clean up any accidental duplicates.
-## Notes
+- **Disabling Low Priority Flag:**
+ To stop an email from being flagged as low priority, remove its corresponding email address from your spreadsheet and (for safety) remove the label from any related messages in Gmail.
-- Tag any emails that you wish to add to your low priority email digest with `zbulk` (or your custom label).
-- Column A of your spreadhsheet will be populated with the email addresses of mail you consider low priority.
-- It is a good idea to occasionally open your spreadsheet and select *"Data > Remove duplicates"* from the menu to prevent email addresses from appearing multiple times in your spreadsheet.
- - Pull requests to fix this (see: automagically deduping email addresses) are welcome.
-- If you want to stop mail from being flagged as low priority, remove the corresponding email address(es) from your spreadsheet and (to be safe) remove the label from any messages in `zbulk` (or your custom label).
-- Low priority emails will be archived, labeled, and marked as read every 15 minutes.
+- **Performance Considerations:**
+ The updated script handles thousands of entries efficiently. For extremely large datasets, consider archiving older data or scheduling deduplication more frequently.
## Credits
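
Note on the README's "Set Up Triggers" step: the four time-driven triggers could also be installed from code rather than through the Apps Script UI. The following is a minimal sketch, not part of this change. The function name `installTriggers` and its `scriptApp` parameter are hypothetical; the parameter defaults to the Apps Script global `ScriptApp` and exists only so the sketch can be exercised with a stub outside Apps Script. The hours are taken from the README's suggested windows and are interpreted in the script's time zone.

```javascript
// Hypothetical one-time installer for the triggers described in the README.
// `scriptApp` defaults to the Apps Script global; pass a stub when testing elsewhere.
function installTriggers(scriptApp = ScriptApp) {
  const handlers = ['arch', 'addEmail', 'noneroll', 'dedupeEmails'];

  // Delete existing triggers for these handlers so that rerunning
  // the installer does not stack duplicate triggers.
  for (const trigger of scriptApp.getProjectTriggers()) {
    if (handlers.includes(trigger.getHandlerFunction())) {
      scriptApp.deleteTrigger(trigger);
    }
  }

  // arch: every 15 minutes.
  scriptApp.newTrigger('arch').timeBased().everyMinutes(15).create();

  // addEmail: daily, in the hour starting at midnight.
  scriptApp.newTrigger('addEmail').timeBased().everyDays(1).atHour(0).create();

  // noneroll: daily, in the hour starting at 5am.
  scriptApp.newTrigger('noneroll').timeBased().everyDays(1).atHour(5).create();

  // dedupeEmails: weekly on Sunday, in the hour starting at 11pm.
  scriptApp.newTrigger('dedupeEmails').timeBased()
    .onWeekDay(scriptApp.WeekDay.SUNDAY).atHour(23).create();
}
```

Running `installTriggers()` once from the Apps Script editor would prompt for the required authorization. Triggers for other handler functions in the project are left untouched.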
diff --git a/email.html b/email.html
index 152c65d..7dda08e 100644
--- a/email.html
+++ b/email.html
@@ -1,40 +1,85 @@
-
-
+
+
Bulk Email Summary
+
-
+
diff --git a/noneroll.gs b/noneroll.gs
index 1713d70..6c3aeba 100644
--- a/noneroll.gs
+++ b/noneroll.gs
@@ -1,109 +1,165 @@
-//run every hour to archive any new email from email in sheet
-function arch() {
- var sheet = SpreadsheetApp.openById('sheetID');
- var range = sheet.getDataRange();
- var values = range.getValues();
- values = [].concat.apply([], values);
- //Logger.log(values)
- var threads = GmailApp.getInboxThreads();
- var label = GmailApp.getUserLabelByName('zbulk')
- for (var i = 0; i < threads.length; i++) {
- if(threads[i].isInInbox()){
- var msg = threads[i].getMessages()[0];
- var email = msg.getFrom().replace(/^.+<([^>]+)>$/, "$1");
- if(values.indexOf(email) > -1){
- threads[i].addLabel(label);
- threads[i].markRead();
- threads[i].moveToArchive();
- }
- }
- }
+// Global Constants
+const SHEET_ID = 'sheetID'; // Replace with your spreadsheet ID
+const SHEET_NAME = 'Sheet1'; // Update if your sheet is named differently
+const ZBULK_LABEL = 'zbulk'; // Change if you use a different Gmail label
+const DELAY_DAYS = 1; // Number of days (1 = last 24 hours)
+
+// Helper to extract the email address from a sender string.
+function extractEmail(fromStr) {
+ const match = fromStr.match(/<([^>]+)>/);
+ return match ? match[1] : fromStr;
}
-//run every day to add any emails added to zbulk folder to sheet
-function addEmail() {
- var sheet = SpreadsheetApp.openById('sheetID');
- var range = sheet.getDataRange();
- var values = range.getValues();
- values = [].concat.apply([], values);
- var label = GmailApp.getUserLabelByName('zbulk')
- var threads = label.getThreads();
- for (var i = 0; i < threads.length; i++) {
- var msg = threads[i].getMessages()[0];
- var email = msg.getFrom().replace(/^.+<([^>]+)>$/, "$1")
- if(values.indexOf(email) < 0){
- sheet.appendRow([email])
+// Run every 15 minutes (see the README trigger setup) to archive new inbox threads from senders listed in the sheet.
+function arch() {
+ const spreadsheet = SpreadsheetApp.openById(SHEET_ID);
+ const sheet = spreadsheet.getSheetByName(SHEET_NAME);
+
+ // Retrieve emails from the first column and convert to a Set for efficient lookups.
+ const sheetData = sheet.getRange(1, 1, Math.max(sheet.getLastRow(), 1), 1).getValues().flat(); // Math.max guards an empty sheet, where getLastRow() returns 0.
+ const sheetEmails = new Set(sheetData);
+
+ const threads = GmailApp.getInboxThreads();
+ const label = GmailApp.getUserLabelByName(ZBULK_LABEL);
+
+ for (const thread of threads) {
+ if (thread.isInInbox()) {
+ const msg = thread.getMessages()[0];
+ const email = extractEmail(msg.getFrom());
+ if (sheetEmails.has(email)) {
+ thread.addLabel(label);
+ thread.markRead();
+ thread.moveToArchive();
+ }
}
}
}
+// Run every day to add the sender addresses of threads labeled zbulk to the sheet.
+function addEmail() {
+ const spreadsheet = SpreadsheetApp.openById(SHEET_ID);
+ const sheet = spreadsheet.getSheetByName(SHEET_NAME);
+
+ // Retrieve emails from the first column and convert to a Set for efficient lookups.
+ const sheetData = sheet.getRange(1, 1, Math.max(sheet.getLastRow(), 1), 1).getValues().flat(); // Math.max guards an empty sheet, where getLastRow() returns 0.
+ const sheetEmails = new Set(sheetData);
+
+ const label = GmailApp.getUserLabelByName(ZBULK_LABEL);
+ const threads = label.getThreads();
+
+ // Collect new emails in an array for batch insertion.
+ const newEmails = [];
+
+ for (const thread of threads) {
+ const msg = thread.getMessages()[0];
+ const email = extractEmail(msg.getFrom());
+ if (!sheetEmails.has(email)) {
+ newEmails.push([email]);
+ sheetEmails.add(email); // Update the set to avoid duplicates in this run.
+ }
+ }
+
+ // Append all new emails at once.
+ if (newEmails.length > 0) {
+ const startRow = sheet.getLastRow() + 1;
+ const range = sheet.getRange(startRow, 1, newEmails.length, 1);
+ range.setValues(newEmails);
+ }
+}
+
+// Automatically deduplicate email addresses in your spreadsheet.
+function dedupeEmails() {
+ const spreadsheet = SpreadsheetApp.openById(SHEET_ID);
+ const sheet = spreadsheet.getSheetByName(SHEET_NAME);
+
+ // Read all email addresses from column A.
+ const data = sheet.getRange(1, 1, Math.max(sheet.getLastRow(), 1), 1).getValues(); // Math.max guards an empty sheet, where getLastRow() returns 0.
+ const uniqueEmails = [];
+ const seen = new Set();
+
+ // Iterate over each row.
+ for (let i = 0; i < data.length; i++) {
+ let email = data[i][0].toString().trim();
+ if (email && !seen.has(email)) {
+ seen.add(email);
+ uniqueEmails.push([email]);
+ }
+ }
+
+ // Clear the existing data and write back only unique emails.
+ sheet.getRange(1, 1, data.length, 1).clearContent(); // Clear only column A so any other columns are left intact.
+ if (uniqueEmails.length > 0) {
+ sheet.getRange(1, 1, uniqueEmails.length, 1).setValues(uniqueEmails);
+ }
+}
+
+// Retrieve threads labeled zbulk whose latest message arrived within the last DELAY_DAYS days (24 hours by default).
function getEmails() {
- var delayDays = 2
- var maxDate = new Date();
- maxDate.setDate(maxDate.getDate() - delayDays);
- var label = GmailApp.getUserLabelByName("zbulk");
- var threads = label.getThreads();
- var data = []
- for (var i = 0; i < threads.length; i++) {
- if (threads[i].getLastMessageDate()>maxDate){
- var d = {}
- d.permalink = threads[i].getPermalink()
- d.subject = threads[i].getFirstMessageSubject()
- var from = threads[i].getMessages()[0].getFrom()
- d.from = from.replace(/\"|<.*>/g,'')
- d.date = threads[i].getLastMessageDate()
- if (from.match(/<(.*)>/)!==null){
- d.email = from.match(/<(.*)>/)[1]
- } else {
- d.email = from
- }
- d.uns = null
+ const maxDate = new Date();
+ maxDate.setDate(maxDate.getDate() - DELAY_DAYS);
+
+ const label = GmailApp.getUserLabelByName(ZBULK_LABEL);
+ const threads = label.getThreads();
+ const data = [];
+
+ // Check every thread.
+ for (const thread of threads) {
+ const lastMsgDate = thread.getLastMessageDate();
+ if (lastMsgDate > maxDate) {
+ const msg = thread.getMessages()[0];
+ const permalink = thread.getPermalink();
+ const subject = thread.getFirstMessageSubject();
+ const from = msg.getFrom();
+ const fromClean = from.replace(/\"|<.*>/g, '');
+ const emailMatch = from.match(/<([^>]+)>/);
+ const email = emailMatch ? emailMatch[1] : from;
+ let unsubscribe = null;
- var uns = threads[i].getMessages()[0].getRawContent().match(/^list\-unsubscribe:(.|\r\n\s)+<(https?:\/\/[^>]+)>/im);
- if(uns) {
- d.uns = uns[uns.length-1]
+ // Try to extract the unsubscribe link from the raw content.
+ const rawContent = msg.getRawContent();
+ const unsMatch = rawContent.match(/^list\-unsubscribe:(.|\r\n\s)+<(https?:\/\/[^>]+)>/im);
+ if (unsMatch) {
+ unsubscribe = unsMatch[unsMatch.length - 1];
} else {
- var rex = /.*?<a[^>]*href=["'](https?:\/\/[^"']+)["'][^>]*>(.*?[Uu]nsubscribe.*?)<\/a>.*?/gi
- body_t = threads[i].getMessages()[0].getBody()
- while(u = rex.exec(body_t)){
- Logger.log("regmatch" + u.length)
- if(u[0].toLowerCase().indexOf('unsubscribe')!==-1){
- for(var j = u.length-1; j >=0; j--){
- if(u[j].substring(0,4)=="http"){
- d.uns=u[j]
- break
- }
- }
- if(d.uns){
- break
- }
-
+ // If not found in raw content, search within the email body.
+ const body = msg.getBody();
+ const regex = /<a[^>]*href=["'](https?:\/\/[^"']+)["'][^>]*>(.*?)<\/a>/gi;
+ let match;
+ while ((match = regex.exec(body)) !== null) {
+ if (match[2].toLowerCase().includes('unsubscribe')) {
+ unsubscribe = match[1];
+ break;
}
}
}
- data.push(d)
- } else {
- break
+
+ data.push({
+ permalink,
+ subject,
+ from: fromClean,
+ date: lastMsgDate,
+ email,
+ uns: unsubscribe
+ });
}
}
return data;
}
-
-//run every day to send summary email including emails from last 24 hours
+// Run every day to send a summary email including emails from the last 24 hours.
function noneroll() {
- emails = getEmails()
- if(emails.length > 0) {
- var date = Utilities.formatDate(new Date(), Session.getScriptTimeZone(), 'yyyy-MM-dd');
- var subject = "Bulk Emails: " + date;
- var t = HtmlService
- .createTemplateFromFile('email');
- t.data = emails;
- var hB = t.evaluate().getContent();
+ const emails = getEmails();
+ if (emails.length > 0) {
+ const date = Utilities.formatDate(new Date(), Session.getScriptTimeZone(), 'yyyy-MM-dd');
+ const subject = "Bulk Emails: " + date;
+ const template = HtmlService.createTemplateFromFile('email');
+ template.data = emails;
+ const htmlBody = template.evaluate().getContent();
+
MailApp.sendEmail({
to: Session.getActiveUser().getEmail(),
subject: subject,
- htmlBody: hB
+ htmlBody: htmlBody
});
}
}
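
The sender and unsubscribe-link parsing in `extractEmail` and `getEmails` is plain string matching, so it can be checked outside Apps Script. The sketch below copies the patterns from the diff into standalone functions. `findUnsubscribeLink` is a hypothetical helper name that does not appear in the change, and the sample strings in the usage note are made up for illustration.

```javascript
// Same pattern as extractEmail in the diff: take the address inside <...>,
// or fall back to the whole string when there are no angle brackets.
function extractEmail(fromStr) {
  const match = fromStr.match(/<([^>]+)>/);
  return match ? match[1] : fromStr;
}

// Hypothetical helper combining the two lookups done in getEmails:
// first the List-Unsubscribe header, then anchors in the HTML body.
function findUnsubscribeLink(rawContent, body) {
  // Header lookup: the last capture group holds the first http(s) URL in angle brackets.
  const header = rawContent.match(/^list-unsubscribe:(.|\r\n\s)+<(https?:\/\/[^>]+)>/im);
  if (header) {
    return header[header.length - 1];
  }

  // Body lookup: return the href of the first link whose text mentions "unsubscribe".
  const anchor = /<a[^>]*href=["'](https?:\/\/[^"']+)["'][^>]*>(.*?)<\/a>/gi;
  let match;
  while ((match = anchor.exec(body)) !== null) {
    if (match[2].toLowerCase().includes('unsubscribe')) {
      return match[1];
    }
  }
  return null;
}
```

For example, `extractEmail('"Shop News" <news@shop.example>')` yields `news@shop.example`, and a message with no `List-Unsubscribe` header whose body contains `<a href="https://shop.example/u/42">Unsubscribe here</a>` yields `https://shop.example/u/42`.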