SQL Join question

spebola · Jun 3, 2003

I am tring to join to tables in an SQL Select statement .

Table A contains the following Columns:

Contract No.
Transaction Month & Year

Table B contains the following Columns

Contract No.
Effective Month & Year

I have tried the following Select statements, but either get no rows selected or multiple rows.

SELECT * FROM Cont JOIN AcctTran ON TableA.Cont# = TableB.Cont# AND TableA.YM = TableB.EffYM

SELECT * FROM Cont JOIN AcctTran ON TableA.Cont# = TableB.Cont# AND TableA.YM >= TableB.EffYM

The first Select may not return a row if the effective date is prior to the transaction date, and the second may return duplicate rows from TableA if Table B has multiple prior effective dates..

Is there a way to limit the number of rows returned?

zy_abc · Jun 3, 2003

Can you give a bit more information with an example of the Table with the output.

wyrd · Jun 3, 2003

Remove the two #s from your SELECT statement and give it another try.

spebola · Jun 3, 2003

Table A
Cont # Year/Month
1215 200306

Table B
Cont # Eff Year/Month
1215 200303
1215 200305

For this sample data, the first Select statement returns 0 rows and the second Select statement returns 2 rows.

spebola · Jun 3, 2003

wyrd,

In the actual code I do have "No" instead of "#", just used # here out of habit.

zy_abc · Jun 3, 2003

How do you want the output to be?

wyrd · Jun 3, 2003

Hm. For curiosities sake put ( ) around your ON statement;

SELECT * FROM Cont JOIN AcctTran ON (TableA.ContNum = TableB.ContNum AND TableA.YM >= TableB.EffYM)

Also, what are the data types of your columns?

wyrd · Jun 3, 2003

Come to think of it Im assuming things here, which is always bad. Im assuming TableA = Cont and TableB = AcctTran? For future reference please copy/paste the exact SQL statement youre using (if possible that is, I know some work is for a business and is not allowed to be shared)

SELECT * FROM Cont JOIN AcctTran ON (Cont.ContNum = AcctTran.ContNum AND Cont.YM >= AcctTran.EffYM)

*bonks head* Scrolling up at your posts, you said this produced 2 rows and the other produced 0 (Cont.YM = AcctTran.EffYM)?

In the sample data you provided, it would indeed return 0 (for Cont.YM = AcctTran.EffYM) because there is no match for dates.

Also for the select statement I just provided (which is what you are using correctly) you will indeed get two rows back, because Cont.YM is >= both AcctTran.EffYM in your sample data.

You just need to change your JOIN logic if this is not the data you are looking to retrieve.

EDIT:
Re-read your original post again. Maybe Im not getting enough sleep LOL.

Anyway, you can limit the amount of rows returned by using TOP..

SELECT TOP 5 * FROM table

Was there another way you you were hoping to limit rows by?

spebola · Jun 3, 2003

I want one row selected, since I would be duplicating transactions. The purpose of the join is to associate contract data with the transaction. I just listed two fields because they are common in both tables. I know I can select from the transaction table and then, using a different data reader, read the appropriate row from the contract table, but I would like to do this in one data reader or data set. The problem is there is a row for every month in the tranaction table, but in the contract table there is a row only when something changes about the contract, hence creating a new effective date. For example, the price may change every 3 months, so I would have an entry in the contract table for 01/2003, 04/2003, 07/2003. The transaction table has an entry for each month. How do you associate the transaction for 6/2003 with the 4/2003 contract?

spebola · Jun 3, 2003

The ContNo field is integer and the date field is a string. Using the sample above, how would I change the join logic to retrieve just the 05/2003 date from the contract table and associate it with the 06/2003 transaction.

I really do appreciate the response. Sorry I left out so many details in the original post.

wyrd · Jun 3, 2003

Okay, I hope Im understanding now (youll have to forgive my pee brain as it moves slowly sometimes)

You have two tables, something like the following;

Table Name: Contracts
Column: Contract_ID (primary key)
Column: Description
Column: Price

Table Name: Transactions
Column: Transaction_ID (primary key)
Column: Contract_ID (foreign key)
Column: Date
Column: Price

Now, the Contracts table keeps tracks of contracts (duh), but the same contract may change in price several times, thus a single contract can have several rows of the same contract (but with different prices). Once every month, theres a Transaction for that contract, and thats what the Transaction keeps track of.

Am I getting this correctly? It sounds like you have an inefficient table design. You should never have duplicate data in a table. You should split this into 3 tables (dont worry Ill give a possible solution to duplicate data in a minute if re-designing is a little to extreme)

Table: Contracts
Column: Contract_ID (primary key)
Column: Description

Table: Contract_Prices
Column: Price_ID (primary key)
Column: Contract_ID (foreign key)
Column: Date
Column: Price

Table: Transactions
Column: Transacton_ID (primary key)
Column: Contract_ID (foreign key)
Column: Date
Column: Price

This way all unique contracts will have their own unique IDs (in the Contracts table). All contract prices will go inside the Contract_Prices table, and all transactions will go inside the Transactions table. Make sense? Then you can get whatever info. you want based off of a single contract id.

Now back to your duplicate data problem and selection problem. You want to select all info from the contract table, plus info from the transaction table?

SELECT Contract.Column, Transaction.Column FROM Contract JOIN Transaction ON (...)

Since everything (I hope) is based off of a unique contract number, then you can use GROUP BY to get a single row;

SELECT * FROM Contract JOIN Transaction ON (...) GROUP BY ContractNumber

To associate the transaction of 6/2003 with the 4/2003 contract, you can easily use the ContractNumber;

SELECT * FROM Contract JOIN Transaction ON (Contract.ContractNumber = Transaction.ContractNumber)

Which is what you were using above, which also makes me think Im not fully understanding, heh.

Heiko · Jun 3, 2003

It is still not clear to me, which one of the EffDates you want to see in those cases that there is more than one.

Maybe it will help if you chose this syntax:

SELECT
Distinct(A.Field1), Distinct(A.Field2), Max(B.Field3)
FROM
TableA as A,
TableB as B
WHERE
A.Cont = B.Cont

Just a wild guess.

wyrd · Jun 3, 2003

Originally posted by spebola
The ContNo field is integer and the date field is a string. Using the sample above, how would I change the join logic to retrieve just the 05/2003 date from the contract table and associate it with the 06/2003 transaction.

I really do appreciate the response. Sorry I left out so many details in the original post.

You need to change your date column to type DateTime. Using a string to store dates is bad, very bad. Also changing the column to DateTime will mean that youll have to store the days as well; 05/01/2003 for example. Matter of fact I just built a table that keeps track of dates on a monthly bases, much like what youre doing. You just need to make sure info is passed in as the first of the month, and also retrieved in the same manner, that way you can insure data integrity.

You dont need to apologize, I should of read your first post more slowly instead of skimming over it like the dipstick I am.

spebola · Jun 3, 2003

The effective date I am looking for is the one that applies to the transaction. In the above example, the 6/2003 transaction would need the contract that is effective 5/2003 not the 3/2003 effective date which was valid only for transactions in 3/2003 and 4/2003.

A word about the date (month & year only) in a string field, not a datetime field. When I started this project in January I put considerable thought in deciding how to store dates. I started using the datetime fields and it was akward. I switched to using a string and it has been much easier. I have a dll that will do any operation on a string date that you can do with a date from a datetime field. This dll works with either a 6 char date (yyyymm) or a 8 char date (yyyymmdd). I will never use a datetime field in any project.

As for your suggestion about table design, I have several tables representing a contract: Summary data, Receipt/Delivery, Pricing, Exhibit A and log. All these tables have ContNo and EffDate in common. It is the transaction table that does not have the effDate, just the date of the transaction.

My question is simply is there a way to write a SELECT statement joining the transaction with the appropriate entry in any of the Contract tables and not return a row for each effective date prior to the transaction date?

I have been a programmer for 37 years, long before SQL existed. One reason I have not used SQL before is I have not found a workable solution for this problem. But in using VB.Net or any PC languagre you have no choice but to use SQL. It seems to me you sholud be able to limit the right side of a join to one row and there may be a way and I just dont know it. Otherwise, I will have to find a work around.
SQL.

spebola · Jun 3, 2003

I found the soultion, the MAX function.

wyrd · Jun 3, 2003

The effective date I am looking for is the one that applies to the transaction. In the above example, the 6/2003 transaction would need the contract that is effective 5/2003 not the 3/2003 effective date which was valid only for transactions in 3/2003 and 4/2003.

My question is simply is there a way to write a SELECT statement joining the transaction with the appropriate entry in any of the Contract tables and not return a row for each effective date prior to the transaction date?

Okay, I think (finally, god forbid) I understand. Going back to the example data above;

Table A (Transaction Table)
Cont # Year/Month
1215 200306

Table B (Contract Table)
Cont # Eff Year/Month
1215 200303
1215 200305

Breaking down your description to hopefully understand...
- Looking for the date that applies to the Transaction (200306).
- We need the Contract that is effective for this Transaction, which is my understanding that it would be the Contract date closest to the Transaction date (200305 in this case, or 200303 if 200305 did not exist)
- Only return ONE Contract for the Transaction, and that ONE Contract is as described above.

If this is incorrect, please correct me. Assuming this is correct, the select statement would simply be;

SELECT TOP 1 Transaction.SomeColumn, Contract.SomeColumn FROM Transaction JOIN Contract (Transaction.ContractNumber = Contract.ContractNumber AND Transaction.Date >= Contract.Date) ORDER BY Contract.Date DESC

Please let me know if my explanation was correct, but the SELECT statement did not perform correctly. If it didnt perform correctly then please tell me what the output was, so I can attempt to fix it.

As with others who seem to get frustred, I always recommend going to the news group microsoft.public.sqlserver.programming as another source of help (I do all the time). Transact SQL professionals frequently answer questions on there. This is of course assuming that youre using SQL Server or MSDE as your database.

A word about the date (month & year only) in a string field, not a datetime field. When I started this project in January I put considerable thought in deciding how to store dates. I started using the datetime fields and it was akward. I switched to using a string and it has been much easier. I have a dll that will do any operation on a string date that you can do with a date from a datetime field. This dll works with either a 6 char date (yyyymm) or a 8 char date (yyyymmdd). I will never use a datetime field in any project.

It may be easy now, but it wont be in the future if you lose database integrity. You can insert anything into that varchar field of yours, and thats just bad. DateTime field insures a date in some fasion or another will come out of it. Also using functions to constantly parse dates from string format is some nice added overhead for yourself. Ew.

There is nothing hard about using a DateTime column in SQL. I strongly suggest you look into using it again, especially since there are techniques that you can use with SQL code to manipulate and parse DateTime columns to get full control of it.

wyrd · Jun 3, 2003

I found the soultion, the MAX function.

Can you show us the SELECT statement? Im a little curious as to how MAX() solved this problem, heh.

spebola · Jun 3, 2003

wyrd,

I really do appreciate your comments and the time you have spent responding to this post. Below is the select statement in the first version. I hope to eliminate the sub query portion in the Where clause and place in the join clause. I have not taken the time to learn how to place it in a white background, in fact I only recently started using these forums. Hope its readable. I cut and pasted it.

"SELECT * FROM AcctHist" & _
" JOIN Cont ON AC_ContNo = CN_ContNo AND AC_YYMM <= CN_AMYM" & _
" WHERE AC_CompCode = " & mCompCode & _
" AND AC_YYMM = " & mYYMM & " AND AC_ContNo >= " & sCont & _
" AND CN_AMYM = (SELECT MAX(CN_AMYM) FROM Cont WHERE AC_ContNo = CN_ContNo)" & _
" ORDER BY AC_ContNo, AC_RD, " & _
"AC_SubType, AC_LiftNo"

I am an old school programmer and the older I get the harder it is to change my ways. I am in a unique situation, in that I will always be the only programmer working on this project, I will own the marketing rights, and the source code will rarely be installed on a customers machine (they dont want to pay the price. You may be right on the date issue, but I think my date routines are just as efficient as the built in functions for datetime. I learned years ago that data integrity is the responsibility of the programmer not the user or the database.

FYI, the project I am writing in vb.net is the same system I have written for the IBM System/36 and rewritten for the AS/400, so this is the second re-write. I have marketed this system since 1983 and still support several on the System36 and the As/400.

Again, I appreciate your comments and time.

spebola · Jun 3, 2003

Sorry, I forgot to answer the questions in your previous post.
You have the concept down. The TOP keyword is new to me. At first glance it means the select would return 1 transaction with the correct contract row. Which would work great for 1 transaction. I havent mentioned this, but I any using the select to create a data reader to read all transactions for a given month.

I am trying to code this project where it can use either SQL Server 2000 or Oracle9i just by changing the connect string in the app.config file. The above select (with MAX function) works on both. I have one client that uses SQL Server and one that likes Oracle.

Thanks for the info on the sql news group. I will post the select there to see if anyone could rewrite it more efficiently.

wyrd · Jun 3, 2003

You can display code in white background using the code tags. Sort of like HTML tags except use brackets, the tag name is code.. (code)stuff(/code) (replace () with [])

Your MAX() solution is quite interesting. I never would of thought of it in a thousand years.

Code:

"SELECT * FROM AcctHist" & _
   " JOIN Cont ON AC_ContNo = CN_ContNo AND AC_YYMM <= CN_AMYM" & _
      " WHERE AC_CompCode = " & mCompCode & _
      " AND AC_YYMM = " & mYYMM & " AND AC_ContNo >= " & sCont & _
      " AND CN_AMYM = (SELECT MAX(CN_AMYM) FROM Cont WHERE AC_ContNo = CN_ContNo)" & _
   " ORDER BY AC_ContNo, AC_RD, AC_SubType, AC_LiftNo"

As for removing the subquery, this is a shot in the dark as I have nothing to test this against;

Code:

"SELECT TOP 1 * FROM AcctHist" & _
   " JOIN Cont ON AC_ContNo = CN_ContNo AND AC_YYMM <= CN_AMYM" & _
      " WHERE AC_CompCode = " & mCompCode & _
      " AND AC_YYMM = " & mYYMM & " AND AC_ContNo >= " & sCont & _
      " AND AC_ContNo = CN_ContNo" & _
   " ORDER BY AC_ContNo, AC_RD, AC_SubType, AC_LiftNo, CN_AMYM DESC"

To be perfectly honest, even if it did work, I doubt itd be all that much more efficient (or even at all) then what youve got. The reason is that it still retrieves a # of records then returns the top (or bottom) 1. You subquery pretty much does the same thing. Unless of course MAX() is doing something funky, like traversing the whole table to find the MAX value. I honestly dont know.

For Transact SQL (language for SQL Server), TOP N simply means return the specified number of rows at the top of the result set. So yes, in the SELECT statement I gave you, TOP 1 would give you a single row from the top of the result set. As you notice I also post ORDER BY date DESC at the end of the statement, which means that in specific case itd give you the last row in the statement. Am I making sense? Im almost positive theres an equivelent for this in Oracle, unfortunately I have no idea what it is.

When I first started REALLY looking at Transact SQL (two weeks ago LOL), I was like "whoa, I didnt know you could do that with database programming?!" So much to learn, so little time. I can only hope that my suggestions were of some use to you.

Now back to beating that dead horse of DateTime columns.

Ask on microsoft.public.sqlserver.programming about it (string vs DateTime for dates). Ask for 10 reasons why string is bad and DateTime is good. I know its probably a little late in the ball game to re-design your database (I know I certainly wouldnt, Im too lazy LOL), but for future reference.. you should at least hear it from the people who program SQL for a living, because I know the advice I give is about as important as a spec of salt in the ocean.

SQL Join question

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Well-known member

Similar threads